Literature DB >> 30505898

Association model data learned from clinician cohorts stratified by patient mortality outcomes at a tertiary academic center.

Jason K Wang, Jason Hom, Santhosh Balasubramanian, Jonathan H Chen.

Abstract

In this data article, we learn clinical order patterns from inpatient electronic health record (EHR) data at a tertiary academic center from three different cohorts of providers: (1) Clinicians with lower-than-expected patient mortality rates, (2) clinicians with higher-than-expected patient mortality rates, and (3) an unfiltered population of clinicians. We extract and make public these order patterns learned from each clinician cohort associated with six common admission diagnoses (e.g. pneumonia, chest pain, etc.). We also share a reusable reference standard or benchmark for evaluating automatically-learned clinical order patterns for each admission diagnosis, based on a manual review of clinical practice literature. The data shared in this article can support further study, evaluation, and translation of data-driven CDS systems. Further interpretation and discussion of this data can be found in Wang et al. (2018).


Year:  2018        PMID: 30505898      PMCID: PMC6247447          DOI: 10.1016/j.dib.2018.10.163

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Value of the data

- Association model data can enable investigation of medical decision-making patterns associated with low-mortality and high-mortality clinicians.
- Association model data can streamline the manual order set curation process by providing committees with real-time information on which orders are most commonly associated with which admission diagnoses.
- Association model data can be used to train or prototype data-driven clinical decision support tools that can ultimately provide clinicians with point-of-care guidance [3], [4], [5].
- Reference order lists curated from published practice guideline literature can serve as reference standards or benchmarks for evaluating automatically-learned clinical order patterns.

Data

Here we stratify clinicians in a tertiary academic hospital into “low-mortality” and “high-mortality” subgroups based on observed vs. expected patient mortality rates. We then train three distinct association models using clinical order data generated by the “low-mortality” and “high-mortality” clinician populations as well as an unfiltered crowd of all clinicians. We provide association data (Appendix A–F, see Table 1 for overview of association data spreadsheets) learned from each clinician cohort for six common admission diagnoses: Altered mental status (ICD9: 780.97), chest pain (ICD9: 786.5), gastrointestinal (GI) hemorrhage (ICD9: 578), heart failure (ICD9: 428), pneumonia (ICD9: 486), and syncope and collapse (ICD9: 780.2).
Table 1

Association model output overview.

Feature | Description
clinical_item_id | Unique clinical order identifier
description | Clinical order name or description
score | Negative log of the P-value computed by Yates' chi-squared statistic
PPV | Positive predictive value: nAB/nA
PPV_95CI_low/high | Positive predictive value 95% confidence intervals
OR | Odds ratio: (nAB/nB)/[(nA−nAB)/(N−nB)]
OR_95CI_low/high | Odds ratio 95% confidence intervals
prevalence | Prevalence: nB/N
prevalence_95CI_low/high | Prevalence 95% confidence intervals
RR | Relative risk: (nAB/nA)/[(nB−nAB)/(N−nA)]
RR_95CI_low/high | Relative risk 95% confidence intervals
P_YatesChi2 | P-value computed by Yates' chi-squared statistic
N | Number of times any clinical order co-occurred within 24 h of the given admission diagnosis order
nAB | Number of times the specified clinical order co-occurred within 24 h of the given admission diagnosis order
nA | Number of times the admission diagnosis order occurred in general
nB | Number of times the specified clinical order occurred in general
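The count-based statistics above can be reproduced directly from the four co-occurrence counts. The sketch below implements the formulas as listed in Table 1; the function name and the example counts are our own illustration, not values from the published spreadsheets.

```python
import math

def association_stats(nA, nB, nAB, N):
    """Association statistics for clinical order B following admission
    diagnosis order A, using the formulas listed in Table 1.
    nA: occurrences of A; nB: occurrences of B; nAB: co-occurrences within
    24 h; N: times any order co-occurred within 24 h of A."""
    ppv = nAB / nA                                     # PPV = nAB/nA
    prevalence = nB / N                                # prevalence = nB/N
    odds_ratio = (nAB / nB) / ((nA - nAB) / (N - nB))  # OR per Table 1
    rel_risk = (nAB / nA) / ((nB - nAB) / (N - nA))    # RR per Table 1
    # Yates-corrected chi-squared on the 2x2 contingency table for A vs. B
    a, b, c, d = nAB, nA - nAB, nB - nAB, N - nA - nB + nAB
    chi2 = (N * (abs(a * d - b * c) - N / 2) ** 2
            / ((a + b) * (c + d) * (a + c) * (b + d)))
    p_yates = math.erfc(math.sqrt(chi2 / 2))  # survival function, 1 df
    score = -math.log(p_yates) if p_yates > 0 else float("inf")
    return {"PPV": ppv, "prevalence": prevalence, "OR": odds_ratio,
            "RR": rel_risk, "P_YatesChi2": p_yates, "score": score}

# Hypothetical counts: 300 of 500 admissions for diagnosis A included order B,
# which appeared 800 times across 5000 co-occurrence opportunities overall.
stats = association_stats(nA=500, nB=800, nAB=300, N=5000)
```

Strong associations yield large `score` values, since the score is the negative log of a vanishingly small P-value.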
In this data article, we also share practice guideline-based reference standards that can be used to evaluate the “correctness” of automatically learned clinical order patterns (Appendix G, see Table 2 for overview of reference standard spreadsheet). These lists of reference orders were manually curated by physicians reviewing clinical practice literature for each admission diagnosis.
Table 2

Guideline reference standard overview.

Feature | Description
icd9 | Admission diagnosis classification code
admission_diagnosis | Admission diagnosis description
clinical_item_id | Unique clinical order identifier
category | Type of order (e.g. lab test, procedure, medication, etc.)
description | Clinical order name or description

Experimental design, materials, and methods

Data source and preparation

We extracted deidentified, structured patient data from the (Epic) EHR for inpatient hospitalizations from 2008–2013 via the Stanford University Medical Center (SUMC) Clinical Data Warehouse [6]. Patient data were pre-processed to reduce complexity [8] across medication [7], lab result, and diagnosis coding. A complete description of data preparation can be found in Sections 3.1 and 3.2 of Wang et al. [1].

Clinician stratification and patient cohort assembly

Clinicians who saw patients between 2010–2013 (n=1,822) were stratified into low-mortality (21.8%, n=397) and high-mortality (6.0%, n=110) extremes using a two-sided P-value score quantifying deviation of observed vs. expected 30-day patient mortality rates. Expected per-patient mortality probabilities were predicted for patients seen in 2010–2013 based on 2008–2009 patient and mortality data (see Section 3.3 of Wang et al. [1] for full-length discussion of clinician stratification methodology). Defining physician-patient attribution using History and Physical Examination notes signed upon admission, three patient cohorts were assembled: Patients seen by low-mortality clinicians, high-mortality clinicians, and an unfiltered crowd of all clinicians. After balancing covariates between patient populations using common-referent 1:1:K propensity score matching [10], we obtained cohorts of size 1,046, 1,046, and 5,230 patients, respectively (see reference [1] for pre- and post-matching covariate distributions).
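As a rough illustration of the stratification idea (not the authors' exact statistic, which is detailed in Section 3.3 of Wang et al. [1]), a clinician's deviation can be scored by comparing observed deaths on their patient panel against the sum of predicted per-patient mortality probabilities using a two-sided normal approximation. The function name and panel below are hypothetical.

```python
import math

def mortality_deviation(expected_probs, observed_deaths):
    """Two-sided normal-approximation P-value for a clinician's observed vs.
    expected 30-day patient mortality. Illustrative sketch only; see Section
    3.3 of Wang et al. [1] for the statistic actually used."""
    expected = sum(expected_probs)                       # expected death count
    variance = sum(p * (1 - p) for p in expected_probs)  # Poisson-binomial variance
    z = (observed_deaths - expected) / math.sqrt(variance)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))       # two-sided normal tail
    return z, p_two_sided

# Hypothetical panel: 200 patients, each with 10% predicted mortality, but only
# 8 observed deaths -> negative z (a candidate "low-mortality" clinician).
z, p = mortality_deviation([0.10] * 200, observed_deaths=8)
```

Ranking clinicians by such a signed, two-sided score is what allows both extremes (lower-than-expected and higher-than-expected mortality) to be pulled out of the overall population.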

Association rule episode mining

We trained three distinct association models using patient encounters from the balanced low-mortality, high-mortality, and crowd patient cohorts, each reflecting clinical order patterns from the corresponding clinician population. We then generated order lists (Appendix A–F) from each association model for the six aforementioned admission diagnoses. Further discussion of association model (“clinical recommender engine”) training can be found in Section 3.6 of Wang et al. [1] and additional reading [9], [11], [12], [13], [14]. To assess similarity among predicted order lists, we can calculate agreement by Rank Biased Overlap [15], which accounts for rank-order (Table 3).
Table 3

Rank Biased Overlap (RBO) computed between each pair of predicted order lists, score-ranked by PPV for the six example admission diagnoses. RBO computes the average fraction of top items in common between two order lists, geometrically weighting all ~2000 candidate clinical order items, and ranges from 0.0 (no correlation or random list order) to 1.0 (perfect agreement). RBO is characterized by a “persistence” parameter p, the probability that an observer reviewing the top k items will continue to observe the (k+1)-th item. For our calculations, we used a default implementation parameter p of 0.98, which geometrically weights emphasis toward the top of each list. RBO values of ~0.7 indicate strong overlap between order lists generated by two cohorts.

Rank biased overlap
Diagnosis | Low-mortality vs. high-mortality | Low-mortality vs. crowd | High-mortality vs. crowd
Altered mental status (780.97) | 0.64 | 0.79 | 0.64
Chest pain (786.5) | 0.64 | 0.77 | 0.70
Gastrointestinal hemorrhage (578) | 0.65 | 0.74 | 0.67
Heart failure (428) | 0.58 | 0.67 | 0.55
Pneumonia (486) | 0.66 | 0.71 | 0.67
Syncope and collapse (780.2) | 0.61 | 0.68 | 0.63
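The RBO values in Table 3 follow the formulation of Webber et al. [15]. A minimal truncated implementation, without the extrapolation and residual terms of the full measure, can be sketched as follows; the list contents below are hypothetical.

```python
def rbo(list_a, list_b, p=0.98):
    """Truncated Rank Biased Overlap: (1-p) * sum over depths d of
    p^(d-1) * A_d, where A_d is the fraction of items shared by the two
    top-d prefixes. Minimal sketch of Webber et al. [15], without the
    extrapolation/residual terms of the full formulation."""
    depth = min(len(list_a), len(list_b))
    seen_a, seen_b = set(), set()
    score = 0.0
    for d in range(1, depth + 1):
        seen_a.add(list_a[d - 1])
        seen_b.add(list_b[d - 1])
        agreement = len(seen_a & seen_b) / d  # A_d: prefix overlap fraction
        score += p ** (d - 1) * agreement
    return (1 - p) * score

# Hypothetical ranked order lists from two cohorts; identical rankings
# approach 1.0 as depth grows, while disjoint rankings give 0.0.
similarity = rbo([f"order_{i}" for i in range(200)],
                 [f"order_{i}" for i in range(200)])
```

With p = 0.98, the geometric weights p^(d-1) concentrate most of the score on roughly the top 50 ranks, matching the description above of weighting emphasis toward the top of each list.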

Practice guideline reference standard

We can evaluate each predicted order list against clinical practice guidelines as a proxy for “good” medical decision making (see Results of reference [1]). Two board-certified internal medicine physicians curated reference lists of clinical orders based on published clinical practice literature sourced from the National Guideline Clearinghouse (www.guideline.gov) and PubMed. After independently curating their lists, the two physicians resolved disagreements (items included in one physician's list but not the other) by consensus to produce a final reference standard for each admission diagnosis. In this data article, we make available reference standards for the six aforementioned admission diagnoses (Appendix G). To assess pre-consensus agreement between the two clinicians, we computed Cohen's Kappa statistics (Table 4).
Table 4

Cohen's Kappa values to assess pre-consensus agreement between reference standards independently curated by two board-certified clinicians from clinical practice guidelines. Values range from −1 to +1, with values <0 indicating poor agreement and values >0.6 indicating substantial agreement [16].

Diagnosis | Pre-consensus Cohen's Kappa statistic
Altered mental status (780.97) | 0.82
Chest pain (786.5) | 0.66
Gastrointestinal hemorrhage (578) | 0.64
Heart failure (428) | 0.75
Pneumonia (486) | 0.72
Syncope and collapse (780.2) | 0.72
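The pre-consensus agreement in Table 4 can be computed with Cohen's kappa over the two physicians' binary include/exclude decisions on the same set of candidate orders. A minimal sketch, with hypothetical ratings:

```python
def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters' binary (1 = include, 0 = exclude)
    decisions over the same candidate items: (p_o - p_e) / (1 - p_e)."""
    n = len(ratings_a)
    # Observed agreement: fraction of items both raters labeled identically
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from each rater's marginal inclusion rate
    rate_a = sum(ratings_a) / n
    rate_b = sum(ratings_b) / n
    p_expected = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical include/exclude votes from two physicians on 8 candidate orders
kappa = cohens_kappa([1, 1, 1, 0, 0, 0, 1, 0],
                     [1, 1, 0, 0, 0, 1, 1, 0])
```

Correcting for chance agreement is what distinguishes kappa from raw percent agreement: two raters who each include about half the candidates would agree ~50% of the time by luck alone.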
Specifications table

Subject area | Medical informatics
More specific subject area | Clinical decision support, machine learning, patient mortality
Type of data | Tables, spreadsheets
How data was acquired | Stanford University Medical Center Clinical Data Warehouse (Epic) electronic health record
Data format | Analyzed data
Experimental factors | Clinical orders with prevalence <1% among 2010–2013 patient hospitalizations were excluded from analysis.
Experimental features | Clinician cohorts were stratified based on observed vs. expected patient mortality outcomes. Association models were trained based on clinical order data [2] generated by low-mortality clinicians, high-mortality clinicians, and an unfiltered clinician crowd.
Data source location | Stanford University Medical Center, Stanford, CA, USA
Data accessibility | Tables are within this article; spreadsheets are attached as supplementary material.
Related research article | Wang JK, Hom J, Balasubramanian S, et al. An evaluation of clinical order patterns machine-learned from clinician cohorts stratified by patient mortality outcomes. J Biomed Inform. 2018;86:109-119.
References (14 in total)

1.  Automated mapping of pharmacy orders from two electronic health record systems to RxNorm within the STRIDE clinical data warehouse.

Authors:  Penni Hernandez; Tanya Podchiyska; Susan Weber; Todd Ferris; Henry Lowe
Journal:  AMIA Annu Symp Proc       Date:  2009-11-14

2.  Dynamically evolving clinical practices and implications for predicting medical decisions.

Authors:  Jonathan H Chen; Mary K Goldstein; Steven M Asch; Russ B Altman
Journal:  Pac Symp Biocomput       Date:  2016

3.  An evaluation of clinical order patterns machine-learned from clinician cohorts stratified by patient mortality outcomes.

Authors:  Jason K Wang; Jason Hom; Santhosh Balasubramanian; Alejandro Schuler; Nigam H Shah; Mary K Goldstein; Michael T M Baiocchi; Jonathan H Chen
Journal:  J Biomed Inform       Date:  2018-09-07       Impact factor: 6.317

4.  Medicine's uncomfortable relationship with math: calculating positive predictive value.

Authors:  Arjun K Manrai; Gaurav Bhatia; Judith Strymish; Isaac S Kohane; Sachin H Jain
Journal:  JAMA Intern Med       Date:  2014-06       Impact factor: 21.873

5.  The measurement of observer agreement for categorical data.

Authors:  J R Landis; G G Koch
Journal:  Biometrics       Date:  1977-03       Impact factor: 2.571

6.  Decaying relevance of clinical data towards future decisions in data-driven inpatient clinical order sets.

Authors:  Jonathan H Chen; Muthuraman Alagappan; Mary K Goldstein; Steven M Asch; Russ B Altman
Journal:  Int J Med Inform       Date:  2017-03-18       Impact factor: 4.046

7.  Distribution of Problems, Medications and Lab Results in Electronic Health Records: The Pareto Principle at Work.

Authors:  Adam Wright; David W Bates
Journal:  Appl Clin Inform       Date:  2010       Impact factor: 2.342

8.  Properties of AdeABC and AdeIJK efflux systems of Acinetobacter baumannii compared with those of the AcrAB-TolC system of Escherichia coli.

Authors:  Etsuko Sugawara; Hiroshi Nikaido
Journal:  Antimicrob Agents Chemother       Date:  2014-09-22       Impact factor: 5.191

9.  Automated physician order recommendations and outcome predictions by data-mining electronic medical records.

Authors:  Jonathan H Chen; Russ B Altman
Journal:  AMIA Jt Summits Transl Sci Proc       Date:  2014-04-07

10.  Inpatient Clinical Order Patterns Machine-Learned From Teaching Versus Attending-Only Medical Services.

Authors:  Jason K Wang; Alejandro Schuler; Nigam H Shah; Michael T M Baiocchi; Jonathan H Chen
Journal:  AMIA Jt Summits Transl Sci Proc       Date:  2018-05-18
