Anahita Davoudi, Natalie S Lee, Krisda Chaiyachati, Danielle Mowery, ThaiBinh Luong, Timothy Delaney, Elizabeth Asch.
Abstract
BACKGROUND: Free-text communication between patients and providers plays an increasing role in chronic disease management, through platforms ranging from traditional health care portals to novel mobile messaging apps. These text data are rich resources for clinical purposes, but their sheer volume renders them difficult to manage. Even automated approaches, such as natural language processing, require labor-intensive manual classification for developing training data sets. Automated approaches to organizing free-text data are necessary to facilitate use of free-text communication for clinical care.
Keywords: chatbots; latent Dirichlet allocation; natural language processing; secure messaging systems; unsupervised learning
Year: 2022 PMID: 35767327 PMCID: PMC9280462 DOI: 10.2196/36151
Source DB: PubMed Journal: J Med Internet Res ISSN: 1438-8871 Impact factor: 7.076
Figure 1. Study workflow. LDA: latent Dirichlet allocation; med: medication; medLDA: medication-specific LDA model; msg: messages.
Examples of lexical and semantic features.
| Package type and category | Examplea |
| Problem | “Musinex, proair inhaler and delsym for |
| Test | “Took meds between 8:30 and 9:00 am after my |
| Treatment | “ |
| Drug product name (DPN) | “Sorry Lisinopril was stopped |
| Drug ingredient (DIN) | “Sorry |
| Drug brand name (DBN) | “Hi have only 11 tablets of |
| Drug dose form (DDF) | “Dr doubled my Lisinopril and removed the water |
| Dose (DOSE) | “Sorry Lisinopril was stopped HCTZ |
| Dose amount (DOSEAMT) | “Hi have only |
| Frequency (FREQ) | “Sorry Lisinopril was stopped HCTZ 25 mg added / |
| Route (RUT) | “Lossrtan is giving me SOBg chest pressure on |
| Duration (DRT) | “Lossrtan is giving me SOB chest pressure on inhalation is this l? Happens about |
| Dose unit | “Good Morning can you call me in a refill for my bph |
aKeywords are italicized.
bMRI: magnetic resonance imaging.
cmeds: medications.
dThe MedEx semantic type for each term is included in parentheses.
eHCTZ: hydrochlorothiazide.
fOV: office visit.
gSOB: shortness of breath.
hbp: blood pressure.
Statistics according to patient and provider messages.
| Message type | Patient messages (n=5689), mean (SD) | Provider messages (n=6442), mean (SD) |
| Words per message | 17.01 (23.40) | 26.75 (28.79) |
| Sentences per message | 2.71 (2.02) | 3.11 (2.11) |
| Messages per user | 23.84 (26.22) | 521.83 (1588.20) |
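The per-message statistics above are means with standard deviations over message-level counts. A minimal sketch of how such figures could be computed follows; the whitespace tokenizer, the punctuation-based sentence splitter, the use of the sample standard deviation, and the toy messages are all assumptions for illustration, since the record does not describe the tooling the authors used.

```python
import re
import statistics


def message_stats(messages):
    """Return {metric: (mean, sd)} for word and sentence counts per message."""
    # Assumption: words are whitespace-delimited tokens.
    word_counts = [len(m.split()) for m in messages]
    # Assumption: a sentence ends at '.', '!' or '?'; empty fragments are dropped.
    sentence_counts = [
        len([s for s in re.split(r"[.!?]+", m) if s.strip()]) for m in messages
    ]
    return {
        # Assumption: sample standard deviation (statistics.stdev).
        "words_per_message": (
            statistics.mean(word_counts),
            statistics.stdev(word_counts),
        ),
        "sentences_per_message": (
            statistics.mean(sentence_counts),
            statistics.stdev(sentence_counts),
        ),
    }


# Hypothetical toy messages, not taken from the study corpus.
toy_messages = [
    "Do you need refills on anything?",
    "Yes please. I have two tablets left.",
    "Ok. Sent to the pharmacy. Let me know when you pick it up.",
]
stats = message_stats(toy_messages)
```

For nonnegative counts, a standard deviation larger than the mean, as in every words-per-message and messages-per-user cell of the table, usually points to a right-skewed distribution with a few very long messages or very active users.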
Figure 2. Characteristics of messages shown with a scatterplot image using word frequency and L2-penalized regression coefficients. Terms with higher usage are colored according to patients (blue) and providers (red). Terms with intermediate colors, such as green, yellow, and orange, reflect coefficients with values that have less of an association with patient or provider usage. Coef: coefficient; Reg: regression.
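The caption names two quantities per term: its frequency and an L2-penalized regression coefficient. The sketch below shows one way such a pair could be produced, assuming the regression is a logistic model that separates provider from patient messages on bag-of-words features. That model choice, the function name `fit_l2_logistic`, the hyperparameters, and the toy messages are assumptions for illustration and are not stated in this record.

```python
import math
from collections import Counter


def fit_l2_logistic(docs, labels, l2=0.1, lr=0.5, epochs=500):
    """Fit bag-of-words logistic regression with an L2 penalty by gradient descent.

    labels: 1 for provider messages, 0 for patient messages (arbitrary coding).
    Returns (coefficient_by_term, bias, corpus_term_frequency).
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    freq = Counter(t for doc in tokenized for t in doc)
    rows = [set(doc) for doc in tokenized]  # binary term-presence features
    w = {t: 0.0 for t in vocab}
    b = 0.0
    n = len(docs)
    for _ in range(epochs):
        # Gradient of the penalty term (l2 / 2) * ||w||^2; the bias is unpenalized.
        grad_w = {t: l2 * w[t] for t in vocab}
        grad_b = 0.0
        for terms, y in zip(rows, labels):
            z = b + sum(w[t] for t in terms)
            p = 1.0 / (1.0 + math.exp(-z))
            err = (p - y) / n  # gradient of the mean log loss w.r.t. z
            for t in terms:
                grad_w[t] += err
            grad_b += err
        for t in vocab:
            w[t] -= lr * grad_w[t]
        b -= lr * grad_b
    return w, b, freq


# Hypothetical toy messages, not taken from the study corpus.
docs = [
    "i need a refill",
    "can i get a refill please",
    "we sent your prescription",
    "we will call your pharmacy",
]
labels = [0, 0, 1, 1]
weights, bias, freq = fit_l2_logistic(docs, labels)
```

Under this coding, terms with positive coefficients lean toward provider usage and terms with negative coefficients lean toward patient usage; coefficients near zero correspond to the intermediate colors the caption describes.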
Figure 3. Distribution of patient (left) and provider (right) messages according to major topics. LDA: latent Dirichlet allocation.
Distribution of patient and provider messages according to shared significant subtopics within each main topic.
| Number of LDAa topics by data set | Messages, n (%) |
| Patient messages (n=5689) | |
| 1 | 2851 (50.11) |
| 2 | 1893 (33.27) |
| 3 | 564 (9.91) |
| 4 | 49 (0.86) |
| 5 | 0 (0) |
| Provider messages (n=6442) | |
| 1 | 3311 (51.40) |
| 2 | 2466 (38.28) |
| 3 | 503 (7.81) |
| 4 | 22 (0.34) |
| 5 | 0 (0) |
aLDA: latent Dirichlet allocation.
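The table above counts messages by how many LDA topics each one carries. One plausible reading, sketched below, is that a topic is assigned to a message when its probability in the message's topic distribution reaches a cutoff. The cutoff value, the function name `topic_count_distribution`, and the toy distributions are assumptions for illustration; this record does not state the assignment rule.

```python
from collections import Counter


def topic_count_distribution(doc_topic_probs, threshold=0.2, max_topics=5):
    """Tabulate messages by number of topics at or above `threshold`.

    Returns {k: (n_messages, percent_of_all_messages)} for k = 1..max_topics.
    Messages with no topic above the cutoff still count toward the denominator.
    """
    counts = Counter(
        sum(1 for p in probs if p >= threshold) for probs in doc_topic_probs
    )
    total = len(doc_topic_probs)
    return {
        k: (counts.get(k, 0), round(100 * counts.get(k, 0) / total, 2))
        for k in range(1, max_topics + 1)
    }


# Hypothetical per-message topic distributions (each row sums to 1).
toy_probs = [
    [0.90, 0.05, 0.05],
    [0.45, 0.45, 0.10],
    [0.34, 0.33, 0.33],
    [0.15, 0.15, 0.70],
]
dist = topic_count_distribution(toy_probs)

# The reported percentages use the full data set as the denominator,
# for example 2851 of 5689 patient messages.
patient_one_topic_pct = round(100 * 2851 / 5689, 2)
```

The reported patient percentages sum to 94.15%, which would be consistent with some messages having no topic above a cutoff, although the record does not say so.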
Distribution of medication intent categories with examples from patient and provider messages.
| Message type and medication intent category | Messages, n (%) | Example messagea | Sublanguage featuresa |
| Patient messages | | | |
| Medication request | 134 (47.5) | “Yes I am. Sent in a new prescription for the 10 mg when we changed the dosage because I | |
| Medication taking | 79 (28.0) | “Sorry Lisinopril was | taking, |
| Medication location | 54 (19.1) | “You sent it to | |
| Medication question | 15 (5.3) | “So at what point would / should I start 5 mg of amlodipine or another | |
| Provider messages | | | |
| Medication question | 173 (68.7) | “Hi - we got your | need_refilled, |
| Medication question response | 41 (16.3) | “I have not heard of amlodipine causing | typical_side_effects, |
| Medication refill question | 21 (8.3) | “Do you need refills on anything? do you need the enalipril refilled too? ok what do you | refill, refill_needed, medsk_need, refill_test refilled_treatment, |
| Medication change | 17 (6.7) | “Hi talked to Dr [**NAME**] | see_tomorrow, dose_question, week, |
aItalics indicate encoded features identified and shared by the example sentence and sublanguage features.
bHCTZ: hydrochlorothiazide.
cfreq: frequency.
ddoseamt: dose amount.
edin: drug ingredient.
fpcam: Perelman Center for Advanced Medicine.
gpah: Pennsylvania Hospital.
hhup: Hospital of the University of Pennsylvania.
iravdin: Ravdin building.
jdpn: drug product name.
kmeds: medications.
Figure 4. Distribution of medication intents among patient messages (left) and provider messages (right) in the medLDA model. medLDA: medication-specific latent Dirichlet allocation.
Performance of majority class by topic classification.
| Message type and medication intent category | Recall | Precision | F1 score |
| Patient messages | | | |
| Medication location | 0.833 | 0.682 | 0.749 |
| Medication question | —a | — | — |
| Medication request | 0.843 | 0.685 | 0.756 |
| Medication taking | 0.481 | 0.745 | 0.585 |
| Provider messages | | | |
| Medication change | — | — | — |
| Medication question | 0.965 | 0.726 | 0.829 |
| Medication question response | 0.342 | 0.636 | 0.445 |
| Medication refill question | — | — | — |
aThe class could not be predicted with this approach.
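A sketch of what a majority-class-by-topic evaluation could look like may help explain the rows marked as unpredictable. It assumes each message has one topic id and one annotated intent, that every topic is labeled with its most frequent intent, and that scoring uses the same messages that defined the mapping. These assumptions, the function names, and the toy data are illustrative only and are not confirmed by this record.

```python
from collections import Counter, defaultdict


def majority_class_by_topic(topics, labels):
    """Map each topic id to the most common annotated label among its messages."""
    by_topic = defaultdict(Counter)
    for topic, label in zip(topics, labels):
        by_topic[topic][label] += 1
    return {t: counts.most_common(1)[0][0] for t, counts in by_topic.items()}


def per_class_scores(topics, labels):
    """Return {label: (recall, precision, f1)}, or None if never predicted."""
    mapping = majority_class_by_topic(topics, labels)
    predicted = [mapping[t] for t in topics]
    scores = {}
    for cls in sorted(set(labels)):
        tp = sum(1 for p, g in zip(predicted, labels) if p == cls and g == cls)
        n_pred = sum(1 for p in predicted if p == cls)
        n_gold = sum(1 for g in labels if g == cls)
        if n_pred == 0:
            # The class is not the majority in any topic, so it is never predicted.
            scores[cls] = None
            continue
        recall = tp / n_gold
        precision = tp / n_pred
        f1 = 2 * precision * recall / (precision + recall)
        scores[cls] = (recall, precision, f1)
    return scores


# Hypothetical topic assignments and intent labels, not taken from the study.
toy_topics = [0, 0, 0, 1, 1, 1, 1]
toy_labels = [
    "request", "request", "question",
    "taking", "taking", "request", "question",
]
scores = per_class_scores(toy_topics, toy_labels)
```

In the toy data, "question" is never the most frequent label within a topic, so it receives no prediction, which mirrors the rows in the table that could not be predicted. The reported F1 scores agree with the harmonic mean of precision and recall; for example, 2 × 0.685 × 0.843 / (0.685 + 0.843) ≈ 0.756 for patient medication requests.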