
Classification and analysis of asynchronous communication content between care team members involved in breast cancer treatment.

Bryan D Steitz¹, Lina Sulieman¹, Jeremy L Warner¹, Daniel Fabbri¹, J Thomas Brown¹, Alyssa L Davis², Kim M Unertl¹.

Abstract

OBJECTIVE: A growing research literature has highlighted the work of managing and triaging clinical messages as a major contributor to professional exhaustion and burnout. The goal of this study was to discover and quantify the distribution of message content sent among care team members treating patients with breast cancer.
MATERIALS AND METHODS: We analyzed nearly two years of communication data from the electronic health record (EHR) between care team members at Vanderbilt University Medical Center. We applied natural language processing to perform sentence-level annotation into one of five information types: clinical, medical logistics, nonmedical logistics, social, and other. We combined sentence-level annotations for each respective message. We evaluated message content by team member role and clinic activity.
RESULTS: Our dataset included 81 857 messages containing 613 877 sentences. Across all roles, 63.4% and 21.8% of messages contained logistical information and clinical information, respectively. Individuals in administrative or clinical staff roles sent 81% of all messages containing logistical information. Physicians sent the highest proportion of messages containing clinical information of any role (33.2%).
DISCUSSION AND CONCLUSION: Our results demonstrate that EHR-based asynchronous communication is integral to coordinating care for patients with breast cancer. By understanding the content of messages sent by care team members, we can devise informatics initiatives to lessen physicians' clerical burden and reduce unnecessary interruptions.


Keywords: breast cancer; burnout; electronic health records; multidisciplinary communication; workflow

Year: 2021 PMID: 34396056 PMCID: PMC8358477 DOI: 10.1093/jamiaopen/ooab049

Source DB: PubMed Journal: JAMIA Open ISSN: 2574-2531

INTRODUCTION

Managing care for patients with cancer requires communication and coordination among numerous specialists and team members who are often distributed across clinic locations. Electronic health record (EHR)-based asynchronous clinical messaging has emerged as a primary technology to support team-based communication. EHR-based messaging is characterized by a centralized structure, which supports messages sent by an individual to a team of providers and staff. In this format, multiple individuals in predetermined teams, often defined by clinic affiliation, can review and respond to each message. In theory, a team-based approach to clinical messaging allows individuals to respond to their respective messages while ensuring that the rest of the team remains informed about the patient’s care. In practice, however, many of these messages lead to unnecessary interruptions from an individual’s current scope of patient care. A growing research literature has suggested that care team members receive an increasingly high volume of asynchronous clinical communications, which leads to professional exhaustion and burnout. More recent studies have highlighted the wider task of managing the inbox and triaging messages as a particular source of work. A study by Arndt et al of family physicians found that managing the messaging inbox takes 23% of their workday. In our previous work, we evaluated the scope and volume of clinical messages sent between care team members, including the actions performed on those messages. We found that over half of the messaging actions performed by a care team treating patients with breast cancer involved only reading a message without subsequently responding to it. Effective message triage offers an opportunity to improve message response time and reduce care team work. 
Asynchronous clinical communication has become integral to effective care delivery, but a meaningful process to triage the large volume of clinical communications is necessary to reduce care team workload. To date, studies assessing clinical message content have primarily been conducted using patient portal messaging. These studies have applied a variety of natural language processing (NLP) approaches, including regression, decision trees, random forests, and neural networks, to identify the needs that patients communicate in their messages. A study by Sulieman et al found that in a sample of 3000 patient-generated messages, 642 contained only logistical or social information that may not require a physician to respond. Our previous work found that patients with breast cancer themselves were involved in only 26.8% of message threads in their EHRs, suggesting that a large volume of messages cannot be classified using models previously developed with a patient communication focus. In this work, we apply NLP to the clinical communications among clinical team members providing care to patients with breast cancer at an academic medical center. We aim to discover, quantify, and describe the distribution of message content, including analysis of care team member roles and message time.

METHODS

We conducted this study at the Vanderbilt-Ingram Cancer Center at Vanderbilt University Medical Center (VUMC). VUMC is an academic medical center located in middle Tennessee and provides referral care across the southeastern United States. VUMC includes a 758-bed Vanderbilt University Hospital and receives 1.6 million annual ambulatory visits. During the period of this study, providers and staff at VUMC used an institutionally developed EHR, StarPanel, for all clinical functions—including secure messaging. The Vanderbilt University Institutional Review Board approved this study (Protocol 160843).

Study population and data sources

Our study population included any patient who had an appointment with a VUMC-affiliated medical or surgical breast oncologist between January 1, 2015, and November 1, 2017, and was the subject of at least 1 message thread. We extracted all EHR-based secure asynchronous message logs corresponding to patients in our cohort between November 1, 2016, and November 1, 2017. Message log data included a unique employee identifier, a unique patient identifier, a message thread identifier, and the timestamp of the message. We mapped each employee identifier to their job role and grouped job roles into 5 classifications: administrative staff, clinical staff, oncology providers, noncancer-specific providers, and other. Clinical staff included clinical technicians, nurses, nurse practitioners, and physician assistants who were involved in direct patient care. Staff classified as “other” included individuals such as pharmacists, pharmacy technicians, and volunteers who were involved in neither clinical administrative tasks nor direct patient care. We identified provider specialty using their national provider identifier. We defined medical oncologists, surgical oncologists, plastic surgeons, and radiation oncologists as oncology providers due to the frequency with which they are involved in the treatment of patients with breast cancer.

Taxonomy

We developed a sentence-level classification scheme of care team communication as shown in Box 1. We adapted our taxonomy from the parent categories of the Taxonomy of Consumer Health Information Needs, with modifications to reflect communication types between providers and staff as identified through informal interviews with VUMC Breast Center providers and staff. The taxonomy contains 4 primary categories that identify the informational purpose of each message: Clinical Information, Medical Logistics Information, Nonmedical Logistics Information, and Social Information. Content that did not fit into any of the 4 primary categories was classified as “Other.” While each message could contain multiple communication types, individual sentences were required to be labeled with a single type.

Box 1. Taxonomy of care team communication types

Clinical Information: Information involving clinical reasoning or delivery of medical care
Medical Logistics Information: Information involving the coordination or scheduling of medical care
Nonmedical Logistics Information: Communications about pragmatic information that is not related to medical care (eg, location of a clinic or a copy of a medical record)
Social Information: Communications related to social interactions or an interpersonal relationship that is not directly related to any of the above needs
Other: Communication that does not fit into one of the above categories

Gold standard

To create a gold standard training set, we randomly selected 200 message threads from our dataset. Messages were split into sentences and subsequently deidentified using the MITRE Identification Scrubber Toolkit. Each message was independently annotated using the Taxonomy of Care Team Communication Types by two annotators familiar with clinical medicine. Annotators reviewed the messages through an electronic crowdsourcing interface in which messages were organized by thread and divided into sentences. Annotators labeled each sentence with a single communication type. Interrater reliability, measured with Cohen’s kappa, was 0.38. Following independent coding, two additional annotators (BDS and KMU) manually reviewed annotations and resolved discrepancies through discussion until consensus was achieved. For each sentence on which the original independent annotators disagreed, we kept the consensus annotation reached by the two additional annotators.
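As an illustration of the interrater reliability measure reported above, the following is a minimal sketch of Cohen's kappa for two annotators. The function name `cohens_kappa` and the six example labels are hypothetical and are not study data; the study does not say which software computed its kappa.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label rates.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical annotations for six sentences (not study data).
annotator_1 = ["clinical", "social", "other", "social", "clinical", "other"]
annotator_2 = ["clinical", "social", "social", "social", "other", "other"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

In this toy example the annotators agree on 4 of 6 sentences, chance agreement is 1/3, and kappa is 0.5.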

NLP approach

We built and evaluated six multiclass machine learning classifiers to identify communication types in secure messages between providers and staff. Machine learning classifiers included: (1) random forest; (2) multinomial naïve Bayes; (3) support vector machine (SVM); (4) bidirectional encoder representations from transformers (BERT); (5) clinical BERT, a BERT model that was previously trained on Medical Information Mart for Intensive Care clinical notes; and (6) SciBERT, a BERT model that was trained on a corpus of scientific literature. We used the cased models for BERT, clinical BERT, and SciBERT, each of which was accessed through the HuggingFace Transformers library. For the BERT models, classification was performed using a linear classification layer that sits atop the BERT architecture. Each classifier output a categorical classification corresponding to 1 of the 5 classifications in our taxonomy for each sentence. We identified the optimal model parameters for the random forest, naïve Bayes, and SVM classifiers using grid search in scikit-learn during the training phase for each respective model. Similarly, we tuned the BERT, clinical BERT, and SciBERT models using the training set of our communications dataset. In tuning the BERT models, we added a fully connected layer atop the BERT architecture while freezing all other layers. We additionally trained a linear classification layer to categorize sentences into one of the five information categories. Optimal parameters for each model were subsequently applied to the test dataset for final model evaluation. Features that served as inputs to our random forest, naïve Bayes, and SVM classifiers included bag of words (BoW), term frequency-inverse document frequency (TF-IDF), and a Word2Vec model pretrained on a Google News dataset. We preprocessed the messages by removing nonalphanumeric characters and excluding stop words retrieved from the Natural Language Toolkit Python package. 
We represented the corpus of messages as a matrix in which each sentence in a message corresponds to a row and features are represented in designated columns. We used word counts in each sentence for the BoW representation. For the TF-IDF representation, each sentence was represented by TF-IDF values that ranged between 0 and 1 and were calculated on the training set. We averaged the word vectors in each sentence to obtain Word2Vec representations. All classifiers were trained and tested on the gold standard corpus of 200 message threads that included 2074 sentences with 5-fold cross-validation.
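The preprocessing and two of the feature representations described above (BoW counts and averaged word vectors) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the study's code: the stop word list is a tiny stand-in for the NLTK list, lowercasing is assumed, the 2-dimensional embeddings are toy values standing in for pretrained Word2Vec vectors, and all function names are hypothetical. TF-IDF weighting is omitted for brevity.

```python
import re
from collections import Counter

# Tiny stand-in stop list; the study used the NLTK stop word list instead.
STOP_WORDS = {"the", "a", "an", "to", "for", "is", "at", "please"}

def tokenize(sentence):
    """Lowercase, replace nonalphanumeric characters with spaces, drop stop words."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", sentence.lower())
    return [tok for tok in cleaned.split() if tok not in STOP_WORDS]

def bag_of_words(tokens, vocabulary):
    """Count vector over a fixed vocabulary (one column per word)."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]

def average_vector(tokens, embeddings, dim):
    """Mean of the word vectors for tokens found in the embedding table."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]

# Hypothetical sentences (not study data); one matrix row per sentence.
sentences = ["Please schedule the MRI for Tuesday.", "Thanks, MRI is scheduled!"]
tokenized = [tokenize(s) for s in sentences]
vocabulary = sorted({tok for toks in tokenized for tok in toks})
bow_matrix = [bag_of_words(toks, vocabulary) for toks in tokenized]

# Toy 2-dimensional embeddings; a real run would load pretrained vectors.
toy_embeddings = {"mri": [1.0, 0.0], "schedule": [0.0, 1.0], "scheduled": [0.0, 1.0]}
dense_matrix = [average_vector(toks, toy_embeddings, dim=2) for toks in tokenized]
```

Words missing from the embedding table are skipped when averaging, which is one common convention; the source does not state how out-of-vocabulary words were handled.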

Statistical analysis

We evaluated the performance of each classifier and respective feature selection method using one-versus-all areas under the receiver operating characteristic curve (AUCs), micro and macro F1 scores, accuracy, precision, and recall. We chose the model with the highest F1 score to classify the sentences in our entire dataset (ie, the unannotated messages). We compared the distribution of the predicted labels in the entire dataset to the gold standard labels. We combined annotations per message by taking the union of the sentence-level annotations for the respective message, and we used these message-level label sets to determine concept co-occurrence, which we visualized using an UpSet graph. We calculated descriptive statistics to evaluate message concepts relative to care team member role and activity. First, we analyzed the content of messages sent and received by care team member role. Second, we summarized the volume of each message content classification sent between roles. Finally, we compared message content by oncology provider clinic activity and by working hours. We defined days with clinic activity as days on which a provider had scheduled appointments or procedures. Working hours were defined as any time spent on EHR-based secure messages between 7:00 am and 7:00 pm local time.
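To make the difference between the micro and macro F1 scores concrete, here is a minimal sketch for single-label multiclass predictions. The function name `f1_scores` and the example labels are hypothetical. Under this definition micro F1 coincides with accuracy; the study does not describe its exact averaging procedure, so the values in Table 1 may follow a different convention.

```python
def f1_scores(true_labels, predicted_labels):
    """Return (micro_f1, macro_f1) for single-label multiclass predictions."""
    classes = sorted(set(true_labels) | set(predicted_labels))
    tp = {c: 0 for c in classes}
    fp = {c: 0 for c in classes}
    fn = {c: 0 for c in classes}
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def f1(t, p, n):
        denom = 2 * t + p + n
        return 2 * t / denom if denom else 0.0

    # Micro: pool counts over classes. Macro: unweighted mean of per-class F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in classes) / len(classes)
    return micro, macro

# Hypothetical gold and predicted labels (not study data).
truth = ["clinical", "clinical", "social", "other", "other", "other"]
preds = ["clinical", "social", "social", "other", "other", "clinical"]
micro_f1, macro_f1 = f1_scores(truth, preds)
```

Macro averaging weights every class equally, so it is more sensitive than micro averaging to performance on the less frequent categories.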

RESULTS

Our gold standard set contained 200 unique message threads consisting of 2074 sentences in 766 messages, a median of 3 sentences per message. The sentence-level annotations contained 568 (27%) medical logistics, 486 (23%) social, 411 (20%) nonmedical logistics, 346 (17%) clinical information, and 263 (13%) other information. Using the gold standard, we developed, trained, and optimized 6 classification algorithms (Table 1). BERT-base yielded the highest accuracy (tied with Clinical BERT), micro F1 score, and AUC, and a macro F1 score of 0.7, with accuracy, micro F1 score, and AUC values of 0.72, 0.72, and 0.91, respectively. Hence, we subsequently applied BERT-base to the full dataset of clinical communications sent about patients in our cohort between November 1, 2016, and November 1, 2017. The full dataset contained 613 877 sentences across 81 857 unique messages. These messages were sent by 4044 unique care team members about 3766 patients (Table 2). Across all roles, more messages contained logistical information (63.4%) than any other classification. Similarly, 30.2% of all messages sent by cancer providers included at least 1 sentence containing clinical information. We present an UpSet visualization in Figure 1 of sentence-level classification sets and their respective co-occurrence in clinical messages.

Table 1.

Classification model metrics

Classifier	Optimal parameters	Parameter range	Accuracy	Macro-precision	Macro-recall	Micro-F1	Macro-F1	AUC
Random forest (SD)	Maximum depth = 100 Maximum features = 2 Number of estimators = 50 Feature selection method = Word2Vec	1–150 in increments of 2 1–100 in increments of 1 1–200 in increments of 2 BoW, TF-IDF, Word2Vec	0.59 (0.047)	0.62 (0.053)	0.54 (0.053)	0.61 (0.047)	0.72 (0.051)	0.74 (0.029)
Naïve Bayes (SD)	Alpha = 0.5 Feature selection method = BoW	0.1–1.5 in increments of 0.1 BoW, TF-IDF, Word2Vec	0.59 (0.026)	0.68 (0.049)	0.61 (0.026)	0.65 (0.026)	0.63 (0.032)	0.78 (0.016)
Support vector machine (SD)	Penalty = 0.1 Regularization = L2 Tolerance for stopping criteria = 1.3 Feature selection method = Word2Vec	0.1–5.0 in increments of 0.1 L1, L2 0.1–2.0 in increments of 0.1 BoW, TF-IDF, Word2Vec	0.61 (0.036)	0.66 (0.044)	0.64 (0.039)	0.68 (0.036)	0.65 (0.041)	0.8 (0.023)
BERT base (SD)	Epochs = 2 Learning rate = 3e−5 Max sequence length = 128	1–10 in increments of 1 1e−5, 2e−5, 3e−5, 4e−5, 5e−5 8–256 in increments of 8	0.72 (0.023)	0.7 (0.022)	0.7 (0.019)	0.72 (0.023)	0.7 (0.023)	0.91 (0.017)
Clinical BERT (SD)	Epochs = 2 Learning rate = 3e−5 Max sequence length = 128	1–10 in increments of 1 1e−5, 2e−5, 3e−5, 4e−5, 5e−5 8–256 in increments of 8	0.72 (0.026)	0.77 (0.030)	0.65 (0.055)	0.69 (0.023)	0.64 (0.026)	0.89 (0.023)
SciBERT	Epochs = 3 Learning rate = 3e−5 Max sequence length = 128	1–10 in increments of 1 1e−5, 2e−5, 3e−5, 4e−5, 5e−5 8–256 in increments of 8	0.71 (0.030)	0.7 (0.031)	0.69 (0.032)	0.71 (0.032)	0.69 (0.022)	0.90 (0.016)

AUC: area under the receiver operating characteristic curve; BERT: bidirectional encoder representations from transformers; BoW: bag of words; SD: standard deviation; TF-IDF: term frequency-inverse document frequency.

Table 2.

Care team messaging statistics by care team member role

	Administrative staff	Clinical staff	Physician (cancer provider)	Physician (noncancer specialist)	Other	Total
Number of care team members	1214	1661	21	972	176	4044
Number of patients	3623	3675	3766	2354	2236	3766
Number of message threads	25 664	34 532	10 970	11 761	2246	51 157
Number of sent messages	48 087	65 619	15 912	16 458	2906	148 982
Clinical information (%)	5941 (12.4)	15 076 (23.0)	4802 (30.2)	5956 (36.2)	710 (24.4)	32 485 (21.8)
Medical logistics (%)	28 619 (59.5)	35 340 (53.9)	8540 (53.7)	7697 (46.8)	1597 (55.0)	81 793 (54.9)
Nonmedical logistics (%)	20 790 (43.2)	25 743 (39.2)	3724 (23.4)	3963 (24.1)	1170 (40.3)	55 390 (37.2)
Social information (%)	13 945 (29.0)	18 613 (28.4)	7815 (49.1)	5926 (36.1)	1139 (39.2)	47 438 (31.8)
Other (%)	8221 (17.1)	16 608 (25.3)	4545 (28.6)	4448 (27.0)	439 (15.1)	34 261 (23.0)
Number of received messages	32 968	50 175	11 404	12 158	1735	108 441
Clinical information (%)	3792 (11.5)	12 504 (24.9)	4314 (37.8)	4707 (38.7)	409 (23.6)	25 726 (23.7)
Medical logistics (%)	21 155 (64.2)	27 376 (54.6)	7003 (61.4)	6701 (55.1)	966 (55.7)	63 201 (58.3)
Nonmedical logistics (%)	11 855 (36.0)	21 458 (42.8)	3633 (31.9)	4294 (35.3)	534 (30.8)	41 774 (38.5)
Social information (%)	12 160 (36.9)	15 163 (30.2)	4906 (43.0)	3738 (30.7)	691 (39.8)	36 658 (33.8)
Other (%)	6413 (19.5)	10 633 (21.2)	2558 (22.4)	2843 (23.4)	430 (24.8)	22 877 (21.1)

* Since messages can contain multiple sentences, percentages for sent and received message content will sum to greater than 100%.
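The footnote above follows from how sentence labels were combined: each message carries the union of its sentences' labels, so one message can count toward several categories. A minimal sketch, with hypothetical function names and made-up rows rather than study data:

```python
from collections import Counter, defaultdict

def message_label_sets(sentence_rows):
    """Union the sentence-level labels of each message into one set."""
    labels = defaultdict(set)
    for message_id, label in sentence_rows:
        labels[message_id].add(label)
    return dict(labels)

def label_percentages(label_sets):
    """Percent of messages containing each label at least once."""
    counts = Counter(label for labels in label_sets.values() for label in labels)
    total = len(label_sets)
    return {label: 100 * n / total for label, n in counts.items()}

# Hypothetical (message_id, predicted_label) pairs, one per sentence.
rows = [
    ("m1", "medical logistics"), ("m1", "social"),
    ("m2", "clinical"), ("m2", "medical logistics"), ("m2", "social"),
    ("m3", "social"), ("m4", "medical logistics"),
]
sets_by_message = message_label_sets(rows)
percentages = label_percentages(sets_by_message)
```

With these four toy messages, medical logistics and social each appear in 75% of messages and clinical in 25%, for a column sum of 175%.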

Figure 1.

UpSet Visualization of Messages Grouped by Classification. The bar graph in the lower left corner depicts sentence-level distribution across each category. Each row in the dot graph represents a classification category; solid dots represent each category part of the intersecting sets. The center bar graph depicts the number of messages in each intersection.

Table 3 presents the content of messages sent between care team member roles. Administrative staff sent more messages to other administrative staff (44.4%) than care team members of any other role. 
Similarly, clinical staff and physicians sent the most messages to other clinical staff. Clinical staff and physicians sent more medical logistics information than any other information classification, regardless of recipient role. There were 20 174 messages sent by care team members that contained only social information or information classified as “Other,” of which 16 985 ended a message thread. A total of 5784 of these messages were sent by cancer providers, representing 52.7% of the total threads in which cancer providers were involved.
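The count of thread-ending messages that contain only social or “Other” information can be reproduced in principle with a simple filter over message-level label sets. The sketch below is an assumption about how such a filter might look, not the authors' implementation; the data layout (threads as timestamp-ordered lists of message id and label set) and all names are hypothetical.

```python
CLOSING_ONLY = {"social", "other"}

def closing_messages(threads):
    """Return ids of final messages whose labels are all social or other."""
    flagged = []
    for thread_id, messages in threads.items():
        if not messages:
            continue
        # Messages are assumed sorted by timestamp; the last one ends the thread.
        message_id, labels = messages[-1]
        if labels and labels <= CLOSING_ONLY:
            flagged.append(message_id)
    return flagged

# Hypothetical threads (not study data).
threads = {
    "t1": [("m1", {"medical logistics"}), ("m2", {"social"})],
    "t2": [("m3", {"clinical", "social"})],
    "t3": [("m4", {"nonmedical logistics"}), ("m5", {"social", "other"})],
}
flagged = closing_messages(threads)
```

Here `m2` and `m5` are flagged, while `m3` is not because it also carries clinical content.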

Table 3.

Content of messages exchanged between care team roles

	Administrative staff	Clinical staff	Physician (cancer provider)	Physician (noncancer specialist)	Other
Administrative staff
Clinical information (%)	1214 (8.5)	2357 (20.1)	427 (16.0)	451 (23.1)	233 (14.6)
Medical logistics (%)	7913 (55.1)	6023 (51.4)	1189 (44.4)	814 (41.7)	700 (43.8)
Nonmedical logistics (%)	5359 (37.3)	3858 (32.9)	406 (15.2)	436 (22.3)	376 (23.5)
Social information (%)	4077 (28.4)	2893 (24.7)	1330 (49.7)	705 (36.1)	1023 (64.1)
Other (%)	2707 (18.9)	3505 (29.9)	914 (34.1)	525 (26.9)	439 (27.5)
Total number of messages	14 359	11 727	2677	1952	1597
Clinical staff
Clinical information (%)	755 (7.9)	3209 (18.2)	1409 (29.3)	1837 (32.3)	1262 (29.2)
Medical logistics (%)	5758 (60.4)	8382 (47.6)	2223 (46.2)	2437 (42.9)	1824 (42.2)
Nonmedical logistics (%)	2946 (30.9)	8014 (45.5)	1011 (21.0)	1146 (20.2)	1150 (26.6)
Social information (%)	3015 (31.6)	4261 (24.2)	2147 (44.6)	1699 (29.9)	2744 (63.4)
Other (%)	1798 (18.9)	4295 (24.4)	1367 (28.4)	1642 (28.9)	1212 (28.0)
Total number of messages	9535	17 625	4809	5685	4327
Physician (cancer provider)
Clinical information (%)	323 (9.5)	997 (21.0)	562 (28.0)	142 (36.1)	248 (33.7)
Medical logistics (%)	2115 (62.4)	2544 (53.7)	910 (45.3)	173 (44.0)	331 (45.0)
Nonmedical logistics (%)	871 (25.7)	1876 (39.6)	556 (27.7)	77 (19.6)	173 (23.5)
Social information (%)	1134 (33.5)	1566 (33.0)	897 (44.6)	198 (50.4)	452 (61.5)
Other (%)	691 (20.4)	1326 (28.0)	672 (33.4)	97 (24.7)	192 (26.1)
Total number of messages	3390	4741	2009	393	735
Physician (noncancer provider)
Clinical information (%)	203 (7.9)	1410 (25.1)	183 (39.6)	573 (30.6)	489 (44.8)
Medical logistics (%)	1416 (55.0)	2778 (49.5)	185 (40.0)	722 (38.6)	448 (41.0)
Nonmedical logistics (%)	1001 (38.9)	2441 (43.5)	80 (17.3)	445 (23.8)	240 (22.0)
Social information (%)	567 (22.0)	1341 (23.9)	251 (54.3)	523 (28.0)	766 (70.1)
Other (%)	497 (19.3)	1489 (26.5)	160 (34.6)	820 (43.8)	347 (31.8)
Total number of messages	2576	5614	462	1871	1092
Other
Clinical information (%)	399 (13.3)	2962 (29.0)	600 (42.1)	1076 (48.1)	209 (30.8)
Medical logistics (%)	1700 (56.5)	4973 (48.6)	640 (44.9)	964 (43.1)	311 (45.9)
Nonmedical logistics (%)	832 (27.6)	3168 (31.0)	283 (19.9)	591 (26.4)	210 (31.0)
Social information (%)	1029 (34.2)	3661 (35.8)	784 (55.1)	878 (39.3)	368 (54.3)
Other (%)	784 (26.0)	3251 (31.8)	400 (28.1)	663 (29.7)	185 (27.3)
Total number of messages	3011	10 227	1424	2236	678

Row-wise care team member roles represent the role from which a message was sent. Each column represents the role of provider who received the respective message. The heatmap visualizes the percent of each information type.

There were 21 providers in our network whom we classified as directly related to breast cancer treatment. These providers sent 15 912 messages through 10 970 distinct threads. Table 4 presents oncology provider messaging statistics by time of day and clinic activity. Each cancer provider sent an average of 13.6 messages (standard deviation [SD] = 11.6) on days with scheduled clinic activity compared to 9.9 messages (SD = 5.9) when they did not have scheduled clinical duties. Regardless of time and clinical activity, medical logistics information was the most common type of sent information, occurring in 52.4%–55.3% of all sent messages. On days in which providers did not have clinical activity, 69.8% of messages received after hours contained clinical information.

Table 4.

Oncology provider messaging statistics by time and clinic activity

	In clinic			Not in clinic
	Working hours	After hours	Total	Working hours	After hours	Total
Number of sent messages	11 136 (93.5%)	778 (6.5%)	11 916	3633 (90.9%)	363 (9.1%)	3996
Clinical information (%)	3289 (29.5)	251 (32.3)	3540	1149 (31.6)	113 (31.1)	1262
Medical logistics (%)	6006 (53.9)	430 (55.3)	6436	1905 (52.4)	199 (54.8)	2104
Nonmedical logistics (%)	2726 (24.5)	199 (25.6)	2925	729 (20.1)	70 (19.3)	799
Social information (%)	5364 (48.2)	379 (48.7)	5743	1871 (51.5)	201 (55.4)	2072
Other (%)	3216 (28.9)	250 (32.1)	3466	985 (27.1)	94 (25.9)	1079
Number of received messages	7891 (94.4%)	471 (5.6%)	8362	2790 (91.7%)	252 (8.3%)	3042
Clinical information (%)	1167 (14.8)	97 (20.6)	3050	1088 (39.0)	176 (69.8)	1264
Medical logistics (%)	4791 (60.7)	294 (62.4)	5085	1771 (63.5)	147 (58.3)	1918
Nonmedical logistics (%)	2482 (31.5)	165 (35.0)	2647	896 (32.1)	90 (35.7)	986
Social information (%)	3398 (43.1)	195 (41.4)	3593	1193 (42.8)	120 (47.6)	1313
Other (%)	1728 (21.9)	102 (21.7)	1830	671 (24.1)	57 (22.6)	728

* Since messages can contain multiple sentences, percentages for sent and received message content will sum to greater than 100%.
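The grouping used in Table 4 can be sketched from the definitions in the Methods: working hours run from 7:00 am to 7:00 pm local time, and a clinic day is one on which the provider has scheduled appointments or procedures. The source does not say whether the 7:00 pm boundary is inclusive, so the half-open interval below is an assumption, as are the function names and the example date.

```python
from datetime import datetime, time

WORK_START = time(7, 0)   # 7:00 am local time
WORK_END = time(19, 0)    # 7:00 pm local time (treated as exclusive here)

def time_bucket(timestamp):
    """Label a message timestamp as working hours or after hours."""
    in_hours = WORK_START <= timestamp.time() < WORK_END
    return "working hours" if in_hours else "after hours"

def clinic_bucket(timestamp, clinic_dates):
    """Label a timestamp by whether the provider had clinic that day."""
    return "in clinic" if timestamp.date() in clinic_dates else "not in clinic"

# Hypothetical provider schedule and message timestamp (not study data).
clinic_dates = {datetime(2017, 3, 6).date()}
sent = datetime(2017, 3, 6, 20, 15)
bucket = (clinic_bucket(sent, clinic_dates), time_bucket(sent))
```

A message sent at 8:15 pm on a scheduled clinic day therefore falls in the "In clinic, After hours" column.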


DISCUSSION

In this study, we assessed and described the content of secure asynchronous messages exchanged between providers to coordinate treatment for patients with breast cancer. We trained and applied NLP classification algorithms to discover message content sent by all care team members treating a cohort of patients over one year. Other studies have investigated clinical message content, but they have primarily focused on messages originating from patients through the patient portal. These studies have applied both manual and automated classification techniques. A study by North et al used manual review to identify that 3.5% of patient portal messages contained urgent, high-risk clinical needs. Another study by Cronin et al compared NLP approaches to apply the taxonomy of consumer health information needs to patient portal messages. They found that 72.3% and 24.8% of studied patient portal messages contained medical information and logistical information, respectively. However, in our previous work, we found that patients are involved in only 26.8% of message threads. To the best of our knowledge, this is one of the first studies to automatically classify the content of secure EHR-based clinical messages sent between care team members, across all care team roles. Our analysis was supported by NLP-based classification methods, which we trained using a gold standard set of messages. We compared multiple classification models and feature types. The best classifier (BERT-base; accuracy 0.72, AUC 0.91) was able to determine which categories of information were present in a sentence. We found that messaging was a primary work product of breast cancer care coordination, such that care team members performed messaging actions in 37.5% of all EHR sessions, averaging 29.8 messaging sessions per day. 
Automated classification of asynchronous messages may aid informatics initiatives to reduce messaging load, such as through message triage or by identifying nonurgent messages that do not require immediate notification. Our NLP approach, however, was subject to several limitations. First, our classifiers were trained on a limited gold standard set of 200 message threads containing 766 unique messages. Previous work suggests that our classification performance may increase with a larger gold standard corpus. However, we were able to improve our classification performance using BERT for transfer learning, which reflects findings from previous studies. The gold standard corpus on which we trained and tuned the NLP models had a relatively low Cohen’s kappa score. However, during a manual review by the independent adjudicators, we noted that many of the discrepancies were related to slight differences in text selection approaches. We hypothesize that annotations differed, in part, due to the differing degrees of clinical experience between reviewers. Nonetheless, we conducted a second manual review with 2 experienced researchers to discuss discrepancies and determine consensus annotations. Interestingly, we saw decreased performance relative to the original BERT model when we applied BERT models pretrained on clinical notes and scientific text. We noted a similar decrease in performance during our preliminary work comparing a Word2Vec model trained on Google News and a Word2Vec model trained on PubMed articles and clinical notes. We hypothesize that clinical notes and scientific text contain a larger degree of clinical detail and jargon than our messages, which is consistent with our results suggesting that 47% of the sentences contained logistical information, compared to only 17% that contained clinical information. 
Unlike the ClinicalBERT and BERT-base models, which utilize the original vocabulary built on nondomain-specific text, SciBERT incorporates a scientific domain-specific vocabulary, which we hypothesize could also affect performance due to the lack of clinical information contained within our message corpus. Our features did not account for grammar or other sentence-level semantics; it is unclear whether performance would be improved with the addition of these higher-level features. Additionally, we trained, tested, and applied our NLP algorithms at the sentence level of each message. As a result, it is likely that many sentences in the same message were split between the training and test datasets, making it possible for a model to memorize features of the overall message and resulting in an overestimate of model performance. However, we hypothesize that there is minimal semantic dependence between sentences, which we will test in future work. Additionally, our cross-validation included only training and testing datasets without an additional validation dataset. We made this decision to maximize the amount of data available for model development. We note the lack of a separate validation set to measure model generalizability as a limitation of our approach. Future work will aim to develop a larger corpus of gold standard messages on which to apply our classification algorithms. Using a larger gold standard corpus will allow us to explore more granular information types such that we can further understand message content with the goal of improving message triage tasks. We focused our analysis on patients who had at least one appointment with a breast medical or surgical oncologist at our institution. We chose this patient population such that we could understand the full scope of message content sent by a care team treating patients with breast cancer over a one-year period. 
However, previous studies have found that patients with breast cancer receive care from multiple healthcare institutions. Inter-institution collaborations are not often supported by EHR-based messaging and require other means of communication. Many care coordination activities occur through synchronous communication (eg, phone calls, in-person conversations). As a result, our findings cannot capture all communication among care team members treating patients with breast cancer, or across all organizations involved in their care. Similarly, we do not account for other synchronous and asynchronous means of provider communication within our institution (eg, email). However, during our study period at VUMC, EHR-based asynchronous communication was the preferred means of communication as a way to document conversation among care team members. Understanding the information discussed in clinical messages is a critical first step to recognizing opportunities to reduce messaging workload. Across all team member roles, we found that 63.4% of messages discussed logistical information. Similarly, all roles sent and received more medical logistics information than any other information type. These results suggest that EHR-based asynchronous clinical communication is highly important in coordinating care, although it is not clear if it is the most efficient or effective approach to care coordination or if the best-qualified people are being asked to deal with these messages. We also found that 81% and 80% of all logistical information was sent and received, respectively, by administrative and clinical staff. This indicates the importance of staff in these roles to coordinating care, which reflects results from our previous work and underscores the importance of including all care team members when evaluating care coordination. 
Numerous previous studies have identified clerical and administrative work, such as responding to messages, as a major factor in physician burnout. We hypothesize that systematically classifying messages to identify those that can be answered by other care team members can help to triage messages and reduce physicians’ messaging workload.
We also found that physicians send and receive more messages containing clinical information than team members in any other role. Nonetheless, these communications accounted for less than 40% of all messages. We hypothesize that providers use other forms of communication to convey more urgent needs.
Our results indicate that 11% of sentences were classified as “Other.” In our manual review of messages, we found that the majority of these sentences contained an acknowledgment of a previous message. Similarly, we found 16 985 messages that ended a message thread and contained only sentences classified as social or “Other” information. Cancer providers sent 5784 of these messages, representing 52.7% of the total threads in which these providers were involved. These results suggest that there is an opportunity to support functionality that can predict the end of a message thread and mark the thread as resolved. Future work could seek to develop algorithms to automatically detect these completed threads without requiring unnecessary messaging actions and responses.
Numerous studies have suggested that work outside of normal working hours and on days without clinic responsibility leads to professional burnout. In our analysis of cancer provider messaging by clinic activity and time, we found that a large amount of messaging activity continued to be performed outside of direct clinical responsibility. Regardless of clinical activity and time of day, logistical information persisted as the most common type of information.
However, our results indicate that nearly 70% of messages received after hours when cancer providers had no scheduled clinical activity contained clinical information. Nonetheless, only 31% of the sent messages contained clinical information. We hypothesize that cancer providers triage these messages based on urgency. Future work should seek to develop algorithms to predict message urgency, which could reduce unnecessary notifications for nonurgent messages.
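One way to address the limitation discussed above, that sentences from a single message may fall in both the training and test datasets, is to assign cross-validation folds by message rather than by sentence. The sketch below illustrates this under stated assumptions: the function names and example identifiers are hypothetical, and this is not the splitting procedure used in the study.

```python
import random


def message_level_folds(message_ids, n_folds=5, seed=0):
    """Assign each sentence a fold so all sentences from one message share a fold.

    message_ids: one message identifier per sentence, in sentence order.
    Returns a list of fold indices aligned with message_ids.
    """
    unique_ids = sorted(set(message_ids))
    random.Random(seed).shuffle(unique_ids)
    # Deal shuffled messages round-robin into folds.
    fold_of_message = {mid: i % n_folds for i, mid in enumerate(unique_ids)}
    return [fold_of_message[mid] for mid in message_ids]


def train_test_indices(folds, test_fold):
    """Split sentence indices into train and test lists for one held-out fold."""
    train = [i for i, f in enumerate(folds) if f != test_fold]
    test = [i for i, f in enumerate(folds) if f == test_fold]
    return train, test


# Hypothetical example: six sentences drawn from three messages.
sentence_message_ids = ["m1", "m1", "m2", "m3", "m3", "m3"]
folds = message_level_folds(sentence_message_ids, n_folds=3)
train_idx, test_idx = train_test_indices(folds, test_fold=0)
```

With this assignment, no message contributes sentences to both sides of a split, so performance measured on the held-out fold cannot benefit from memorized message-level features. Libraries such as scikit-learn offer an equivalent grouped splitter (`GroupKFold`).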

CONCLUSIONS

Our study demonstrates that EHR-based asynchronous communications are integral to coordinating the care of patients with breast cancer. This study is one of the first to apply NLP to classify the content of messages sent between care team members. Understanding the content of messages sent by care team members affords the opportunity to devise informatics initiatives to improve physicians’ clerical burden and reduce unnecessary interruptions.

FUNDING

BDS and JTB were supported by the 4T15LM007450 training grant from the United States National Library of Medicine.

AUTHOR CONTRIBUTIONS

BDS, LS, and KMU were involved in the conception of the work, data analysis and interpretation, drafting of the article, and revision of the article. JLW and DF provided guidance in data analysis and interpretation and participated in drafting the article. JTB and ALD were involved in data collection and participated in drafting the article.
