Lina Sulieman1, David Gilmore2, Christi French2, Robert M Cronin3, Gretchen Purcell Jackson4, Matthew Russell2, Daniel Fabbri5. 1. Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA. Electronic address: lina.m.sulieman@vanderbilt.edu. 2. Digital Reasoning Systems, Inc., Nashville, TN, USA. 3. Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Internal Medicine, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Pediatrics, Vanderbilt University Medical Center, Nashville, TN, USA. 4. Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Pediatric Surgery, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Pediatrics, Vanderbilt University Medical Center, Nashville, TN, USA. 5. Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Computer Science and Engineering, Vanderbilt University, Nashville, TN, USA.
Abstract
OBJECTIVE: Patients communicate with healthcare providers via secure messaging in patient portals. As patient portal adoption increases, growing messaging volumes may overwhelm providers. Prior research has demonstrated promise in automating classification of patient portal messages into communication types to support message triage or answering. This paper examines if using semantic features and word context improves portal message classification. MATERIALS AND METHODS: Portal messages were classified into the following categories: informational, medical, social, and logistical. We constructed features from portal messages including bag of words, bag of phrases, graph representations, and word embeddings. We trained one-versus-all random forest and logistic regression classifiers, and convolutional neural network (CNN) with a softmax output. We evaluated each classifier's performance using Area Under the Curve (AUC). RESULTS: Representing the messages using bag of words, the random forest detected informational, medical, social, and logistical communications in patient portal messages with AUCs: 0.803, 0.884, 0.828, and 0.928, respectively. Graph representations of messages outperformed simpler features with AUCs: 0.837, 0.914, 0.846, 0.884 for informational, medical, social, and logistical communication, respectively. Representing words with Word2Vec embeddings, and mapping features using a CNN had the best performance with AUCs: 0.908 for informational, 0.917 for medical, 0.935 for social, and 0.943 for logistical categories. DISCUSSION AND CONCLUSION: Word2Vec and graph representations improved the accuracy of classifying portal messages compared to features that lacked semantic information such as bag of words, and bag of phrases. Furthermore, using Word2Vec along with a CNN model, which provide a higher order representation, improved the classification of portal messages.
OBJECTIVE:Patients communicate with healthcare providers via secure messaging in patient portals. As patient portal adoption increases, growing messaging volumes may overwhelm providers. Prior research has demonstrated promise in automating classification of patient portal messages into communication types to support message triage or answering. This paper examines if using semantic features and word context improves portal message classification. MATERIALS AND METHODS: Portal messages were classified into the following categories: informational, medical, social, and logistical. We constructed features from portal messages including bag of words, bag of phrases, graph representations, and word embeddings. We trained one-versus-all random forest and logistic regression classifiers, and convolutional neural network (CNN) with a softmax output. We evaluated each classifier's performance using Area Under the Curve (AUC). RESULTS: Representing the messages using bag of words, the random forest detected informational, medical, social, and logistical communications in patient portal messages with AUCs: 0.803, 0.884, 0.828, and 0.928, respectively. Graph representations of messages outperformed simpler features with AUCs: 0.837, 0.914, 0.846, 0.884 for informational, medical, social, and logistical communication, respectively. Representing words with Word2Vec embeddings, and mapping features using a CNN had the best performance with AUCs: 0.908 for informational, 0.917 for medical, 0.935 for social, and 0.943 for logistical categories. DISCUSSION AND CONCLUSION: Word2Vec and graph representations improved the accuracy of classifying portal messages compared to features that lacked semantic information such as bag of words, and bag of phrases. Furthermore, using Word2Vec along with a CNN model, which provide a higher order representation, improved the classification of portal messages.
Authors: Jinying Chen; John Lalor; Weisong Liu; Emily Druhl; Edgard Granillo; Varsha G Vimalananda; Hong Yu Journal: J Med Internet Res Date: 2019-03-11 Impact factor: 5.428
Authors: Juan Antonio Lossio-Ventura; Sergio Gonzales; Juandiego Morzan; Hugo Alatrista-Salas; Tina Hernandez-Boussard; Jiang Bian Journal: Artif Intell Med Date: 2021-05-07 Impact factor: 7.011