
Cluster Analysis of Primary Care Physician Phenotypes for Electronic Health Record Use: Retrospective Cohort Study.

Allan Fong¹, Mark Iscoe², Christine A Sinsky³, Adrian D Haimovich², Brian Williams⁴, Ryan T O'Connell⁴, Richard Goldstein⁴, Edward Melnick².

Abstract

BACKGROUND: Electronic health records (EHRs) have become ubiquitous in US office-based physician practices. However, the different ways in which users engage with EHRs remain poorly characterized.
OBJECTIVE: The aim of this study is to explore EHR use phenotypes among ambulatory care physicians.
METHODS: In this retrospective cohort analysis, we applied affinity propagation, an unsupervised clustering machine learning technique, to identify EHR user types among primary care physicians.
RESULTS: We identified 4 distinct phenotype clusters generalized across internal medicine, family medicine, and pediatrics specialties. Total EHR use varied for physicians in 2 clusters with above-average ratios of work outside of scheduled hours. This finding suggested that one cluster of physicians may have worked outside of scheduled hours out of necessity, whereas the other preferred ad hoc work hours. The two remaining clusters represented physicians with below-average EHR time and physicians who spend the largest proportion of their EHR time on documentation.
CONCLUSIONS: These findings demonstrate the utility of cluster analysis for exploring EHR use phenotypes and may offer opportunities for interventions to improve interface design to better support users' needs. ©Allan Fong, Mark Iscoe, Christine A Sinsky, Adrian D Haimovich, Brian Williams, Ryan T O'Connell, Richard Goldstein, Edward Melnick. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 15.04.2022.


Keywords: EHR; cluster analysis; electronic health record; machine learning; phenotypes; primary care; unsupervised machine learning

Year: 2022 PMID: 35275070 PMCID: PMC9055474 DOI: 10.2196/34954

Source DB: PubMed Journal: JMIR Med Inform

Introduction

As of 2021, the vast majority of US office-based physicians used an electronic health record (EHR) [1]. The transition from paper to electronic records has many potential benefits but has also introduced new burdens. Furthermore, EHR use dominates clinical time [2] and is associated with burnout [3-5]. Despite the ubiquity of EHRs, patterns of clinician use are poorly characterized. A 2019 survey study of clinicians reported widely divergent, subjective experiences with their EHR use and found that individual user differences accounted for over half of the variation in EHR use [6]. User-level variation can be due to disparities in proficiency that could potentially be remedied with appropriate training [7-10].

Emerging evidence suggests there are elements aside from proficiency that differentiate EHR users. For example, recent cross-sectional analyses of ambulatory care physicians’ EHR use have found significant differences in time spent on EHRs based on gender [11,12], specialty [12,13], and country [14]. Audit logs offer a wealth of information derived from granular observations of users’ EHR actions [15,16]. For example, research using log data has demonstrated associations between physicians’ EHR activities and vendor-defined metrics of efficiency [17] and that efficiency varied based on physicians’ years of experience and shift type [18].

In this study, we propose to use audit log data for the de novo identification of EHR user types (ie, EHR use phenotypes). Phenotype was first introduced by Richesson et al [19] as a biological concept to describe a set of observable biological traits. In the context of EHR use measures, phenotype will be used to describe observable use patterns across gender and specialty differences as defined by an unsupervised clustering approach called affinity propagation. First, 5 EHR use measures will be standardized using z-scores, which will then be used to calculate the similarities between physicians. A grid search and algorithm constraints will then be used to identify optimal clusters across a cohort of ambulatory care physicians.

Methods

Study Setting and Data Sources

This study retrospectively examined EHR log data of nontrainee, primary care physicians employed by a large ambulatory practice network (Northeast Medical Group) in northeastern United States (Connecticut, New York, and Rhode Island) between March 2018 and February 2020. Physicians were included if they specialized in general internal medicine, family medicine, or general pediatrics.

Ethics Approval

All data were anonymized, with the investigators blinded to the participants’ identities. The study protocol was approved by Northeast Medical Group’s Institutional Review Board (IRB number 2000026556).

EHR Use Measures

We retrieved data from the Epic Signal platform (Epic Systems) stratified by month and derived 5 proposed, time-based core EHR use measures normalized to 8 hours of scheduled patient time (Table 1) [20]. The first measure is EHR-Time8, defined as the time a physician spends on EHRs (both during and outside of scheduled patient hours) [20]. The second measure is work outside of work (WOW8), not to be confused with WOW carts (ie, workstations on wheels, a common industry term). WOW8 is defined as the time a physician works on EHRs outside of scheduled patient hours [20]. The third measure is Note-Time8, defined as the time a physician spends on documentation [20]. The fourth and fifth measures are IB-Time8 and Order-Time8, defined as the times a physician spends on inbox activities and on orders, respectively [20]. To account for relationships between EHR-Time8 and its component measures, we reported the ratios of WOW8, Note-Time8, IB-Time8, and Order-Time8 to EHR-Time8, denoted as WOW-EHR, Note-EHR, IB-EHR, and Order-EHR, respectively. These measures (Table 1), which have been validated and used in previous studies [20,21], were calculated and extracted from the Epic Signal platform. Each physician’s EHR use measures were averaged across study months to account for variation in metric calculations introduced by changes in measure definitions over time due to the vendor’s continuous quality improvement processes. For this analysis, we only considered physicians with valid metric months. Months with fewer than 30 clinical hours scheduled and less than 1 hour of EHR use were excluded from the analysis as invalid metric months. These thresholds were determined based on previous manual chart review validation and analysis of EHR vendor data [13].
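The derivation of the 5 measures can be illustrated with a minimal sketch. It is not the vendor's calculation: the field names (`scheduled`, `ehr`, `wow`, `note`, `inbox`, `orders`), the formula "8 × hours / scheduled hours" for the normalization, the choice to compute ratios per month before averaging, and the reading that a month missing either threshold is invalid are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical mapping from ratio measure to the raw monthly field it uses.
COMPONENTS = {"WOW-EHR": "wow", "Note-EHR": "note",
              "IB-EHR": "inbox", "Order-EHR": "orders"}


def is_valid_month(month, min_scheduled=30, min_ehr=1):
    """Assumed reading: a month is valid only if it meets both minimums.

    This also guarantees nonzero denominators in the calculations below.
    """
    return month["scheduled"] >= min_scheduled and month["ehr"] >= min_ehr


def month_measures(month):
    """Compute the 5 measures for one month of raw hours."""
    # Assumed normalization: hours per 8 hours of scheduled patient time.
    measures = {"EHR-Time8": 8 * month["ehr"] / month["scheduled"]}
    for name, field in COMPONENTS.items():
        # Component8 / EHR-Time8 reduces to component hours / EHR hours.
        measures[name] = month[field] / month["ehr"]
    return measures


def physician_measures(months):
    """Average each measure across a physician's valid months."""
    per_month = [month_measures(m) for m in months if is_valid_month(m)]
    if not per_month:
        return None  # physician has no valid metric months
    return {name: mean(row[name] for row in per_month)
            for name in per_month[0]}


# Made-up hours for one physician; the third month is invalid.
example_months = [
    {"scheduled": 100, "ehr": 70, "wow": 14, "note": 28, "inbox": 7, "orders": 14},
    {"scheduled": 80, "ehr": 60, "wow": 6, "note": 30, "inbox": 6, "orders": 12},
    {"scheduled": 20, "ehr": 0.5, "wow": 0, "note": 0.2, "inbox": 0.1, "orders": 0.1},
]
result = physician_measures(example_months)
```

Because the 8-hour normalization cancels in each ratio, only EHR-Time8 depends on scheduled hours in this sketch.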

Table 1

Electronic health record (EHR) use measures and definitions.

Measure	Definition
EHR-Time₈	Time a physician spends on EHRs (both during and outside of scheduled patient hours) normalized to 8 hours of scheduled patient time
WOW-EHR	Ratio of EHR time that occurs during work outside of work (WOW₈^a) hours: WOW₈/EHR-Time₈
Note-EHR	Ratio of EHR time a physician spends on documentation: Note-Time₈^b/EHR-Time₈
IB-EHR	Ratio of EHR time a physician spends on inbox (IB) activities: IB-Time₈^c/EHR-Time₈
Order-EHR	Ratio of EHR time a physician spends on orders: Order-Time₈^d/EHR-Time₈

aWOW8: work outside of work hours normalized to 8 hours of scheduled patient time.

bNote-Time8: note time hours normalized to 8 hours of scheduled patient time.

cIB-Time8: inbox time hours normalized to 8 hours of scheduled patient time.

dOrder-Time8: order time hours normalized to 8 hours of scheduled patient time.


Cluster Analysis

Clusters were required to include individuals from at least two primary care specialties. We did not require that every individual be assigned to a phenotype cluster, and we sought to minimize the total number of phenotypes. Affinity propagation, an algorithm that takes a set of pairwise similarities between data points and finds clusters on the basis of maximizing the total similarity between data points in a cluster, was used for phenotype discovery [22]. Affinity propagation has advantages over other clustering algorithms, such as not requiring the number of clusters to be specified in advance. A major disadvantage of affinity propagation is its high computational cost and resource requirement; however, this approach was deemed feasible given this study’s sample size [22]. First, a standard z-score for each measure was calculated in order to center and scale the data. Similarities between data points were then calculated using Euclidean distance, which for two 2D points is the length of the line segment connecting them. A grid search was then performed by varying the damping factor and preference from 0.5 to 1 and from 2 to 4, respectively, to identify the optimal clustering given the initial cluster conditions. Physicians in clusters that did not have representation from at least two specialties were excluded. Finally, physician gender and specialty distributions were described between clusters. All analyses were performed using Python software (version 3.7; Python Software Foundation) and scikit-learn (version 0.24; scikit-learn developers) [23].
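The preprocessing and constraint steps described above can be sketched in plain Python. This is an illustration under stated assumptions, not the study code: the population standard deviation is assumed for the z-score, similarity is assumed to be the negative Euclidean distance (scikit-learn's built-in `euclidean` affinity uses the negative squared distance instead), the grid step sizes are invented, and the cluster labels passed to `multispecialty_clusters` are assumed to come from a separate affinity propagation run.

```python
from collections import defaultdict
from itertools import product
from math import sqrt
from statistics import mean, pstdev


def zscores(values):
    """Center and scale one measure (population SD is an assumption)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def standardize(rows):
    """Apply z-scores column by column to a list of per-physician rows."""
    columns = [zscores(col) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]


def euclidean(p, q):
    """Length of the segment between two points of equal dimension."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def similarity_matrix(points):
    """Assumed convention: closer points get a larger (less negative) score."""
    return [[-euclidean(p, q) for q in points] for p in points]


def multispecialty_clusters(labels, specialties, min_specialties=2):
    """Return labels of clusters that span at least `min_specialties`."""
    seen = defaultdict(set)
    for label, specialty in zip(labels, specialties):
        seen[label].add(specialty)
    return {label for label, found in seen.items()
            if len(found) >= min_specialties}


# Candidate settings for the grid search. The ranges follow the text;
# the step sizes are assumptions. Damping stops below 1 because
# scikit-learn requires a damping factor in [0.5, 1).
dampings = [0.5, 0.6, 0.7, 0.8, 0.9]
preferences = [2, 3, 4]
grid = list(product(dampings, preferences))
```

Each `(damping, preference)` pair in `grid` would be passed to a clustering run, and `multispecialty_clusters` would then identify which of the resulting clusters satisfy the specialty constraint.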

Results

Identifying Clusters

Of 332 ambulatory, nontrainee physicians, 290 (87.3%) had valid month metrics. Of those, 173 (52.1% of the 332 physicians) were eligible physicians in the specialties of interest: 117 (67.6%) in internal medicine, 36 (20.8%) in family medicine, and 20 (11.6%) in pediatrics. Gender distribution of the eligible physicians was 47.4% (82/173) female and 52.6% (91/173) male. We identified 4 clusters that met our a priori defined clustering conditions, accounting for 97.7% (169/173) of eligible physicians (Figure 1).
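The cohort percentages above can be rechecked with a few lines of arithmetic. The helper name `pct` is introduced here for illustration only; the counts are taken from the paragraph.

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


cohort = {
    "valid month metrics (of 332)": pct(290, 332),
    "eligible (of 332)": pct(173, 332),
    "internal medicine (of 173)": pct(117, 173),
    "family medicine (of 173)": pct(36, 173),
    "pediatrics (of 173)": pct(20, 173),
    "female (of 173)": pct(82, 173),
    "male (of 173)": pct(91, 173),
    "clustered (of 173)": pct(169, 173),
}

# The specialty counts should add up to the eligible cohort.
specialty_total = 117 + 36 + 20
```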

Figure 1

Summary of workflow and exclusion criteria.

EHR Use Measures and Phenotypes Clusters

The phenotype clusters are “Lower EHR time,” “Higher note time,” “Work outside of work,” and “Notes outside of work.” The EHR use measures across clusters are summarized in Table 2. There was a significant association between phenotype clusters and each EHR use measure: EHR-Time8 (Kruskal-Wallis H=72.7, P<.001), WOW-EHR (H=84.3, P<.001), Note-EHR (H=89.0, P<.001), IB-EHR (H=45.8, P<.001), and Order-EHR (H=46.8, P<.001). The z-scores for the measures are displayed in Figure 2 to illustrate the relative differences between clusters.
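For readers unfamiliar with the Kruskal-Wallis H statistic reported above, the following is a minimal standard-library sketch of how it is computed from ranked data, with the usual tie correction. It is not the study code, and the sample values at the end are made up rather than drawn from the study data.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """H statistic for k independent groups, corrected for ties."""
    pooled = [v for group in groups for v in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    tie_sizes = Counter(pooled).values()
    correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)
    return h / correction


# Made-up example with three fully separated groups.
h_example = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

With 4 clusters, H would be compared against a chi-square distribution with 3 degrees of freedom.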

Table 2

Electronic health record (EHR) use measures by phenotype cluster.

Measure	Phenotype clusters, median (IQR)
Measure	Lower EHR time	Higher note time	Work outside of work	Notes outside of work	All
EHR-Time₈^a	4.62 (4.20-5.43)	5.81 (4.41-6.22)	6.83 (5.95-8.36)	5.90 (5.37-6.36)	5.62 (4.57-6.40)
WOW-EHR^b	0.07 (0.04-0.12)	0.05 (0.03-0.07)	0.21 (0.17-0.26)	0.13 (0.10-0.19)	0.11 (0.06-0.19)
Note-EHR^c	0.24 (0.20-0.28)	0.46 (0.43-0.49)	0.31 (0.27-0.36)	0.37 (0.33-0.40)	0.29 (0.24-0.38)
IB-EHR^d	0.14 (0.12-0.18)	0.06 (0.05-0.08)	0.15 (0.11-0.17)	0.10 (0.08-0.12)	0.13 (0.09-0.16)
Order-EHR^e	0.19 (0.17-0.24)	0.14 (0.12-0.17)	0.16 (0.14-0.18)	0.14 (0.12-0.17)	0.17 (0.14-0.20)

aEHR-Time8: time a physician spends on EHRs normalized to 8 hours of scheduled patient time.

bWOW-EHR: ratio of EHR time that occurs during work outside of scheduled hours.

cNote-EHR: ratio of EHR time that a physician spends on documentation.

dIB-EHR: ratio of EHR time that a physician spends on inbox activities.

eOrder-EHR: ratio of EHR time that a physician spends on orders.

Figure 2

Z-scores for electronic health record (EHR) use measure across clusters. EHR-Time8: time a physician spends on EHRs normalized to 8 hours of scheduled patient time; IB-EHR: ratio of EHR time that a physician spends on inbox activities; Note-EHR: ratio of EHR time that a physician spends on documentation; Order-EHR: ratio of EHR time that a physician spends on orders; WOW-EHR: ratio of EHR time that occurs during work outside of scheduled hours.


“Lower EHR Time” Cluster

The “Lower EHR time” cluster was the largest cluster, constituting 42.2% (73/173) of eligible physicians. Physicians in this cluster spent the least amount of time on EHRs (EHR-Time8: median 4.62, IQR 4.20-5.43). “Lower EHR time” cluster physicians had the lowest median Note-EHR ratio of 0.24 (IQR 0.20-0.28) and the second lowest median WOW-EHR ratio of 0.07 (IQR 0.04-0.12). They also had an above-average median IB-EHR ratio of 0.14 (IQR 0.12-0.18), second only to the “Work outside of work” cluster, and the highest median Order-EHR ratio of 0.19 (IQR 0.17-0.24).

“Higher Note Time” Cluster

“Higher note time” cluster physicians, constituting only 8.7% (15/173) of the total, had near-average normalized EHR time (EHR-Time8: median 5.81, IQR 4.41-6.22). Physicians in this cluster spent the largest proportion of their EHR time documenting notes (Note-EHR: median 0.46, IQR 0.43-0.49) compared to physicians in other clusters. They also spent the lowest proportions of their EHR time outside of scheduled hours and on inbox activities, with median WOW-EHR and IB-EHR ratios of 0.05 (IQR 0.03-0.07) and 0.06 (IQR 0.05-0.08), respectively.

“Work Outside of Work” Cluster

“Work outside of work” cluster physicians, constituting 27.2% (47/173) of the total, spent the most time on EHRs (EHR-Time8: median 6.83, IQR 5.95-8.36) and the largest proportion of that time outside of work hours (WOW-EHR: median 0.21, IQR 0.17-0.26). This cluster of physicians had near-average median Note-EHR and Order-EHR ratios of 0.31 (IQR 0.27-0.36) and 0.16 (IQR 0.14-0.18), respectively, and an above-average median IB-EHR ratio of 0.15 (IQR 0.11-0.17).

“Notes Outside of Work” Cluster

“Notes outside of work” cluster physicians, constituting 19.7% (34/173) of the total, had the second-highest median WOW-EHR ratio of 0.13 (IQR 0.10-0.19) but had near-average total normalized EHR time (EHR-Time8: median 5.90, IQR 5.37-6.36). This cluster of physicians had an above-average median Note-EHR ratio of 0.37 (IQR 0.33-0.40) and below-average median IB-EHR and Order-EHR ratios of 0.10 (IQR 0.08-0.12) and 0.14 (IQR 0.12-0.17), respectively.

Phenotype Clusters by Specialty and Gender

Physician distribution across phenotype clusters by specialty and gender is reported in Table 3. There was a significant association between the clusters and specialty (χ²₆=26.67, P<.001). Pediatricians primarily fell into the “Higher note time” and “Notes outside of work” clusters (16/20, 80%) and accounted for 47% (7/15) of the total physicians in the “Higher note time” cluster. Family and internal medicine physicians were primarily distributed across the “Lower EHR time” and “Work outside of work” clusters (family medicine: 29/36, 81%; internal medicine: 87/113, 77%). In addition, there was a significant association between gender and clusters (χ²₃=18.28, P<.001). Female physicians were more prominent in the “Work outside of work” and “Notes outside of work” clusters, accounting for 64% (30/47) and 62% (21/34) of the clusters, respectively. Male physicians accounted for 71% (52/73) of the “Lower EHR time” cluster.

Table 3

Physician specialty and gender distribution by phenotype cluster.

Distribution		Number of physicians (N=173), n (%)	Phenotype clusters				P value
			Lower EHR^a time (n=73), n (%)	Higher note time (n=15), n (%)	Work outside of work (n=47), n (%)	Notes outside of work (n=34), n (%)
Specialty								<.001
	Family medicine	36 (21)	19 (26)	2 (13)	10 (21)	5 (15)
	Internal medicine	113 (65)	52 (71)	6 (40)	35 (74)	20 (59)
	Pediatrics	20 (12)	2 (3)	7 (47)	2 (4)	9 (26)
Gender								<.001
	Female	80 (46)	21 (29)	8 (53)	30 (64)	21 (62)
	Male	89 (51)	52 (71)	7 (47)	17 (36)	13 (38)
Total		169 (98)	73 (42)	15 (9)	47 (27)	34 (20)

aEHR: electronic health record.
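As a worked example of the test of association, the Pearson chi-square statistic can be computed directly from the gender counts in Table 3. This is a minimal sketch of the standard formula, not the study code; the function name `chi_square` is introduced here for illustration.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a 2D table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Rows: female, male. Columns: Lower EHR time, Higher note time,
# Work outside of work, Notes outside of work (counts from Table 3).
gender_by_cluster = [
    [21, 8, 30, 21],
    [52, 7, 17, 13],
]
stat, df = chi_square(gender_by_cluster)
print(f"chi-square = {stat:.2f} with {df} degrees of freedom")
```

For these counts the statistic rounds to 18.28 with 3 degrees of freedom, in line with the value reported for gender.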


Discussion

Principal Findings

In this unsupervised clustering machine learning analysis of a cohort of primary care physicians, we identified 4 distinct EHR use phenotypes characterized by the total time spent on EHR activities and the ratios of those times in comparison to one another. These phenotypes were differentiated and described by patterns of use consistent with overall efficiency, higher documentation time, and working outside of work hours; these patterns were generally associated with the “Lower EHR time,” “Higher note time,” and “Work/Notes outside of work” clusters, respectively. While exploratory, these results provide insights into EHR use phenotypes across gender and specialties that can complement and provide additional context for current EHR use research.

Work Outside of Scheduled Hours

We identified 2 phenotype clusters that had above-average ratios for work outside of scheduled hours. Although “Work outside of work” and “Notes outside of work” clusters both had high WOW-EHR ratios, only the “Work outside of work” cluster had significantly higher than average EHR-Time8. A possible explanation for this is that physicians in the “Work outside of work” cluster work from home partly out of necessity because they require more time on EHRs, whereas physicians in the “Notes outside of work” cluster may elect to finish work at home, suggesting a preference for ad hoc work hours.

Note Time

Time spent on clinical documentation accounted for the largest proportion of total EHR time in each cluster. There was, however, considerable variation in the ratio of note time to EHR time across clusters: from 0.24 of EHR time in the “Lower EHR time” cluster to 0.46 in the “Higher note time” cluster despite similar total EHR time in both clusters. Potential explanations for this variation include differences in clinic- or physician-specific workflows (eg, scribe support or team-based documentation; differences in depth and complexity of encounters and expectations for documentation; and use of form, copied, or auto-populated notes) and differences in documentation style, particularly among the “Higher note time” cluster that may include physicians who deliberately spend more time on documentation.

Limitations

This exploratory work only used time-based metrics and did not account for patient acuity or complexity. Although the data were gathered over a 2-year period, systemic differences in patient volume and care could have affected the results. In addition, this work was limited to a single ambulatory practice network in one region of the United States and was limited to primary care physicians. Some types of EHR activities (eg, chart review) were not included in the metrics, and it is possible that other activities or practice domains could also affect clustering. Furthermore, it should be noted that this study only identified EHR use phenotypes and did not explore reasons behind differences in EHR use or assign value to the phenotypes.

Conclusions

Our findings may highlight opportunities for interventions to improve EHR design and use to better support EHR users’ needs. Potential differences in users’ needs were identified for each phenotype cluster. The “Higher note time” and “Notes outside of work” clusters might benefit from scribe support more than the other two clusters. The “Work outside of work” cluster might benefit from inbox support and restructuring their practice for a more team-based approach. Physicians in the “Lower EHR time” cluster could be consulted as local champions to help their peers improve their EHR efficiency. By identifying and classifying individual EHR use and user needs, we can better understand and target interventions at the individual or department level. Future work should validate these phenotypes in larger cohorts and in diverse settings, explore differences in physicians’ training and demographics across phenotypes, and investigate the relationships among EHR use phenotypes, patient outcomes, and clinician satisfaction and burnout.

21 in total

1. Clustering by passing messages between data points.

Authors: Brendan J Frey; Delbert Dueck
Journal: Science Date: 2007-01-11 Impact factor: 47.728

2. Cross-sectional survey of workplace stressors associated with physician burnout measured by the Mini-Z and the Maslach Burnout Inventory.

Authors: Kristine Olson; Christine Sinsky; Seppo T Rinne; Theodore Long; Ronald Vender; Sandip Mukherjee; Michael Bennick; Mark Linzer
Journal: Stress Health Date: 2019-01-21 Impact factor: 3.519

3. Tethered to the EHR: Primary Care Physician Workload Assessment Using EHR Event Log Data and Time-Motion Observations.

Authors: Brian G Arndt; John W Beasley; Michelle D Watkinson; Jonathan L Temte; Wen-Jan Tuan; Christine A Sinsky; Valerie J Gilchrist
Journal: Ann Fam Med Date: 2017-09 Impact factor: 5.166

4. Assessment of Electronic Health Record Use Between US and Non-US Health Systems.

Authors: A Jay Holmgren; N Lance Downing; David W Bates; Tait D Shanafelt; Arnold Milstein; Christopher D Sharp; David M Cutler; Robert S Huckman; Kevin A Schulman
Journal: JAMA Intern Med Date: 2021-02-01 Impact factor: 21.873

5. Electronic Health Record Use by Sex Among Physicians in an Academic Health Care System.

Authors: Sarah D Tait; Sachiko M Oshima; Yi Ren; Alexander E Fenn; Mina Boazak; Eugenia McPeek Hinz; E Shelley Hwang
Journal: JAMA Intern Med Date: 2021-02-01 Impact factor: 21.873

6. Novel electronic health record (EHR) education intervention in large healthcare organization improves quality, efficiency, time, and impact on burnout.

Authors: Kenneth E Robinson; Joyce A Kersey
Journal: Medicine (Baltimore) Date: 2018-09 Impact factor: 1.817

7. Characterizing electronic health record usage patterns of inpatient medicine residents using event log data.

Authors: Jason K Wang; David Ouyang; Jason Hom; Jeffrey Chi; Jonathan H Chen
Journal: PLoS One Date: 2019-02-06 Impact factor: 3.240

8. Metrics for assessing physician activity using electronic health record log data.

Authors: Christine A Sinsky; Adam Rule; Genna Cohen; Brian G Arndt; Tait D Shanafelt; Christopher D Sharp; Sally L Baxter; Ming Tai-Seale; Sherry Yan; You Chen; Julia Adler-Milstein; Michelle Hribar
Journal: J Am Med Inform Assoc Date: 2020-04-01 Impact factor: 4.497

9. Allocation of Physician Time in Ambulatory Practice: A Time and Motion Study in 4 Specialties.

Authors: Christine Sinsky; Lacey Colligan; Ling Li; Mirela Prgomet; Sam Reynolds; Lindsey Goeders; Johanna Westbrook; Michael Tutty; George Blike
Journal: Ann Intern Med Date: 2016-09-06 Impact factor: 25.391

10. Characterizing physician EHR use with vendor derived data: a feasibility study and cross-sectional analysis.

Authors: Edward R Melnick; Shawn Y Ong; Allan Fong; Vimig Socrates; Raj M Ratwani; Bidisha Nath; Michael Simonov; Anup Salgia; Brian Williams; Daniel Marchalik; Richard Goldstein; Christine A Sinsky
Journal: J Am Med Inform Assoc Date: 2021-07-14 Impact factor: 4.497