| Literature DB >> 35840972 |
Marieke M van Buchem1,2,3, Olaf M Neve4, Ilse M J Kant5,6,7, Ewout W Steyerberg6,7, Hileen Boosman8, Erik F Hensen4.
Abstract
BACKGROUND: Evaluating patients' experiences is essential when incorporating the patients' perspective in improving healthcare. Experiences are mainly collected using closed-ended questions, although the value of open-ended questions is widely recognized. Natural language processing (NLP) can automate the analysis of open-ended questions for an efficient approach to patient-centeredness.Entities:
Keywords: Natural language processing; Patient satisfaction; Patient-centered care; Sentiment analysis; Unsupervised machine learning
Mesh:
Year: 2022 PMID: 35840972 PMCID: PMC9284859 DOI: 10.1186/s12911-022-01923-5
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 3.298
Fig. 1Overview of the different tasks and phases
Description of the vestibular schwannoma care pathway in the LUMC
| Vestibular schwannomas are benign intracranial tumors with a heterogeneous clinical presentation: they may present as small, slow-growing, and asymptomatic tumors, but also as large, faster-growing, and potentially fatal disease. Patients typically present with symptoms of hearing loss, loss of balance, and vertigo, but may also suffer from facial numbness, facial paralysis, or elevated intracranial pressure. In non-progressive tumors, active surveillance with MRI is usually the management option of choice. In progressive tumors, surgery or radiotherapy is performed to prevent future complications. After an active intervention, prolonged active surveillance ensues in these patients as well, in order to identify possible recurrences. The LUMC is an expert referral center for vestibular schwannoma in the Netherlands. The care is organized in an integrated practice unit including all specialties involved in the diagnosis and treatment (i.e., neurosurgery, otorhinolaryngology, radiology, and radiation oncology). |
Overview of the number of AI-PREM responses per sentiment
| Question | Sentiment | Number of patients N (%) | Average PEM score of matched questions (1–10), µ ± sd |
|---|---|---|---|
| Q1 | Positive | 359 (67.2%) | 9.7 ± 0.9 |
| Q1 | Negative | 26 (4.9%) | 8.1 ± 2.4** |
| Q2 | Positive | 360 (67.4%) | 9.7 ± 0.7 |
| Q2 | Negative | 31 (5.8%) | 7.7 ± 2.6** |
| Q3 | Positive | 325 (60.9%) | 9.6 ± 1.1 |
| Q3 | Negative | 40 (7.5%) | 8.3 ± 1.8* |
| Q4 | Positive | 343 (64.2%) | 6.9 ± 1.7 |
| Q4 | Negative | 39 (7.3%) | 6.4 ± 2.0 |
| Q5 | Positive | 121 (22.7%) | |
| Q5 | Negative | 35 (6.6%) | |
Neutral responses are omitted. Per category (question and sentiment), the table shows the average scores on the PEM questions that matched the AI-PREM questions. P-values from the independent-samples t-test: * = p < 0.001, ** = p < 0.0001. AI-PREM: artificial intelligence patient-reported experience measure. PEM: patient experience monitor. Q: question. sd: standard deviation
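The footnote above refers to an independent-samples (Student's) t-test comparing PEM scores between the positive and negative groups. A minimal stdlib-only sketch of that statistic, using hypothetical score lists rather than the study data:

```python
# Sketch of the pooled-variance independent-samples t-statistic used in
# the table footnote. The two score lists are hypothetical stand-ins.
from statistics import mean, variance
from math import sqrt

def t_statistic(a, b):
    """Pooled-variance t-statistic for two independent samples."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))

pos = [9.5, 9.8, 10.0, 9.6, 9.9]   # hypothetical positive-group PEM scores
neg = [8.0, 7.5, 8.6, 8.2]         # hypothetical negative-group PEM scores
print(round(t_statistic(pos, neg), 2))
```

In practice one would use a statistics library (e.g. `scipy.stats.ttest_ind`) to also obtain the p-value; the pooled-variance form shown here assumes roughly equal group variances.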
Most important improvements that were made during the iterative development process
| – To first perform a sentiment analysis and then create a separate topic model per sentiment and per question, instead of creating one topic model for both sentiments. This led to more specific topics, from which points of improvement could be derived more easily, increasing interpretability and actionability |
| – To include not only the negative feedback topics but also the positive ones, in order to obtain more balanced information. This proved essential for selecting and prioritizing points of improvement. In addition, the positive topics served as motivators for the healthcare team |
| – To move from a fixed number of topics to an adaptive approach that automatically chooses the optimal number of topics per subject. This increased completeness |
| – To add a quantitative dimension to the qualitative output of the topic model, in order to help prioritize the aspects of care that need the most attention |
| – To include n-grams up to trigrams instead of unigrams only. This increased the interpretability and actionability of the topics |
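Two of the refinements above, splitting responses by sentiment before topic modelling and feeding the model n-grams up to trigrams, can be sketched as follows. The responses, sentiment labels, and the simple frequency table standing in for a topic model are all illustrative, not the study's actual pipeline:

```python
# Sketch: per-sentiment feature tables built from 1- to 3-grams.
# A real pipeline would fit a separate topic model (e.g. NMF or LDA)
# on each sentiment group's table; a Counter stands in here.
from collections import Counter

def ngrams(tokens, max_n=3):
    """All 1- to max_n-grams of a token list."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]

responses = [
    ("the surgeon explained everything clearly", "positive"),
    ("waiting time was far too long", "negative"),
    ("waiting time before the mri scan", "negative"),
]

per_sentiment = {}
for text, label in responses:
    per_sentiment.setdefault(label, Counter()).update(ngrams(text.split()))

print(per_sentiment["negative"].most_common(3))
```

The bigram "waiting time" surfaces as a recurring negative phrase here, which illustrates why trigram-level features made topics more interpretable than unigrams alone.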
Fig. 2Overview of the input, models, and output of the AI-PREM tool
Fig. 3Topic model for Q5
Representativeness of the different topic models per category
| Question | Positive categories in total | Per topic | Negative categories in total | Per topic |
|---|---|---|---|---|
| Q1 | 94.4% (n = 72) | T1: 100% (n = 36) T2: 88.9% (n = 36) | 55.6% (n = 18) | T1: 60% (n = 10) T2: 50% (n = 8) |
| Q2 | 93.3% (n = 75) | T1: 97.1% (n = 35) T2: 100% (n = 10) T3: 85% (n = 20) T4: 90% (n = 10) | 71% (n = 31) | T1: 100% (n = 3) T2: 100% (n = 3) T3: 83.3% (n = 6) T4: 100% (n = 3) T5: 75% (n = 4) T6: 28.6% (n = 7) T7: 60% (n = 5) |
| Q3 | 98.4% (n = 63) | T1: 100% (n = 43) T2: 95% (n = 20) | 76.9% (n = 39) | T1: 100% (n = 4) T2: 33.3% (n = 3) T3: 85.7% (n = 7) T4: 100% (n = 5) T5: 66.7% (n = 3) T6: 77.8% (n = 9) T7: 62.5% (n = 8) |
| Q4 | 100% (n = 65) | T1: 100% (n = 41) T2: 100% (n = 12) T3: 100% (n = 12) | 86.7% (n = 15) | T1: 100% (n = 5) T2: 80% (n = 10) |
| Q5 | 86.2% (n = 29) | T1: 85.7% (n = 21) T2: 87.5% (n = 8) | 55.5% (n = 20) | T1: 50% (n = 10) T2: 60% (n = 10) |
Representativeness is defined as the proportion of texts within a topic that fit the description of that topic, calculated by dividing the number of texts that fit the description by the total number of texts within the topic. Q: AI-PREM question. T: automatically extracted topic
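The per-category totals in the table appear to be the pooled per-topic counts. As a small worked check, assuming the Q1 positive topics contribute 36/36 (T1, 100%) and 32/36 (T2, 88.9%) fitting texts:

```python
# Sketch of the representativeness metric defined above: fitting texts
# divided by total texts. The per-topic fit counts for Q1 are inferred
# from the table's percentages (36 of 36 for T1, 32 of 36 for T2).
def representativeness(n_fit, n_total):
    return round(100 * n_fit / n_total, 1)

print(representativeness(36 + 32, 36 + 36))  # → 94.4, matching the Q1 positive total
```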
Fig. 4a Stage 1: the spider plot showing the percentage of positive and negative texts per question. Stage 2: once the end-user clicks on one of the questions, the automatically extracted topics are shown. The positive topics are shown on the left and the negative topics on the right. b Stage 3: if the end-user wants to dive into one of the topics, they can click on that topic and read the actual patient answers that belong to that topic. In this example, the end-user is looking at the topics within the ‘Other’ category and has clicked on positive topic 1 and negative topic 1