Lina Sulieman, Jamie R Robinson, Gretchen P Jackson.
Abstract
BACKGROUND: Patient portals are consumer health applications that allow patients to view their health information. Portals facilitate interactions between patients and their caregivers by offering secure messaging. Patients communicate different needs through portal messages. Medical needs contain requests for delivery of care (e.g., reporting new symptoms). Automating the classification of medical decision complexity in portal messages has not been investigated.
Keywords: Consumer health; Decision complexity; Machine learning; Medical billing; Medical decision; Patient portals
Year: 2020 PMID: 32570124 PMCID: PMC7303623 DOI: 10.1016/j.jss.2020.05.039
Source DB: PubMed Journal: J Surg Res ISSN: 0022-4804 Impact factor: 2.192
Factors of medical decision-making.
| Diagnosis or management options | Amount and complexity of data | Level of risk of complications | Complexity of decision-making |
|---|---|---|---|
| Minimal | Minimal or none | Minimal | Straightforward |
| Limited | Limited | Low | Low |
| Multiple | Moderate | Moderate | Moderate |
| Extensive | Extensive | High | High |
Complexity of decision-making is based on the three factors listed in the first three columns. At least two of the three factors should be met to assign a given complexity level.
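The lookup described by this table can be sketched in a few lines of Python. This is only an illustration: it assumes the "two of three" rule means that a level is assigned when at least two factors meet or exceed it, and the function and constant names are hypothetical, not taken from the paper.

```python
# Ordinal complexity levels from the last column of the table (0 = lowest).
COMPLEXITY = ["Straightforward", "Low", "Moderate", "High"]

# Column vocabularies copied from the table; list position is the ordinal level.
DIAGNOSIS_OPTIONS = ["Minimal", "Limited", "Multiple", "Extensive"]
DATA_COMPLEXITY = ["Minimal or none", "Limited", "Moderate", "Extensive"]
RISK = ["Minimal", "Low", "Moderate", "High"]


def decision_complexity(diagnosis: str, data: str, risk: str) -> str:
    """Return the highest level met or exceeded by at least two factors."""
    levels = sorted([
        DIAGNOSIS_OPTIONS.index(diagnosis),
        DATA_COMPLEXITY.index(data),
        RISK.index(risk),
    ])
    # With three factors, the middle value is the highest level that
    # at least two of them reach (assumed reading of the rule).
    return COMPLEXITY[levels[1]]


print(decision_complexity("Multiple", "Limited", "Moderate"))  # Moderate
print(decision_complexity("Minimal", "Limited", "High"))       # Low
```

Taking the middle of the three sorted levels keeps a single high factor from raising the overall complexity on its own.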
The parameter search space used to find the optimal models for Naïve Bayes and random forest.
| Machine learning algorithm | Parameter name | Parameter possible values |
|---|---|---|
| Multinomial Naïve Bayes | Smoothing parameter alpha | 0-1 (0 = no smoothing) |
| | Fitting prior (learning class prior probabilities) | True, false (if false, a uniform prior is applied) |
| Random forest | Number of estimators | 20, 30, 40, 50, 60, 70, 80, 90, 100 |
| | Maximum depth | None (nodes expanded until all leaves are pure), 3, 4, 5, 6, 7, 8, 9, 10 |
| | Minimum samples split | 2, 3, 4, 5 |
| | Maximum leaf nodes | 5, 6, 7, 8, 9, 10, 11, 12 |
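To give a sense of the size of this search space, the sketch below writes it out as plain Python dictionaries and enumerates every combination. The key names follow scikit-learn's keyword arguments for `MultinomialNB` and `RandomForestClassifier`; that the study used scikit-learn is an assumption, as is the 0.1 step for alpha (the table gives only the range 0-1). The `expand` helper is hypothetical.

```python
from itertools import product

# Assumption: alpha is sampled in steps of 0.1; the table only states 0-1.
NAIVE_BAYES_GRID = {
    "alpha": [round(0.1 * i, 1) for i in range(11)],
    "fit_prior": [True, False],
}

# Values copied from the random forest rows of the table.
RANDOM_FOREST_GRID = {
    "n_estimators": list(range(20, 101, 10)),
    "max_depth": [None] + list(range(3, 11)),
    "min_samples_split": [2, 3, 4, 5],
    "max_leaf_nodes": list(range(5, 13)),
}


def expand(grid):
    """Yield every parameter combination in the grid as a dict."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


nb_candidates = list(expand(NAIVE_BAYES_GRID))
rf_candidates = list(expand(RANDOM_FOREST_GRID))
print(len(nb_candidates), len(rf_candidates))
```

Dictionaries of this shape can also be passed as the `param_grid` argument of scikit-learn's `GridSearchCV`, which performs the same enumeration with cross-validation.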
Precision of the machine learning models tuned on precision.
| Class | No machine learning: medical term number | Multinomial Naïve Bayes: bag of words | Multinomial Naïve Bayes: TF-IDF | Multinomial Naïve Bayes: medical terms | Random forest: bag of words | Random forest: TF-IDF | Random forest: medical terms |
|---|---|---|---|---|---|---|---|
| No decision | 0 | 0.62 | 0.62 | 0.53 | 0.5 | 0 | |
| Straightforward | 0.4 | 0.65 | 0.57 | 0.6 | 0.54 | 0.49 | |
| Low | 0.2 | 0.29 | 0.6 | 0 | 0 | 0 | |
| Moderate | 0 | 0 | 0 | 0 | 0 | 0 | |
| Micro average | 0.14 | 0.57 | 0.57 | 0.55 | 0.53 | 0.49 | |
| Macro average | 0.28 | 0.38 | 0.39 | 0.28 | 0.26 | 0.12 | |
| Weighted | 0.31 | 0.57 | 0.47 | 0.43 | 0.24 | ||
Bold text represents the highest precision for identifying the corresponding medical decision complexity class.
Recall of the machine learning models tuned on recall.
| Class | No machine learning: medical term number | Multinomial Naïve Bayes: bag of words | Multinomial Naïve Bayes: TF-IDF | Multinomial Naïve Bayes: medical terms | Random forest: bag of words | Random forest: TF-IDF | Random forest: medical terms |
|---|---|---|---|---|---|---|---|
| No decision | 0 | 0.5 | 0.5 | 0 | |||
| Straightforward | 0.17 | 0.67 | 0.71 | 0.57 | 0.79 | 1 | |
| Low | 0.25 | 0.25 | 0.25 | 0.25 | 0.12 | 0 | |
| Moderate | 0 | 0 | 0 | 0 | 0 | 0 | |
| Micro average | 0.14 | 0.57 | 0.59 | 0.57 | 0.57 | 0.49 | |
| Macro average | 0.35 | 0.39 | 0.4 | 0.45 | 0.35 | 0.16 | |
| Weighted | 0.14 | 0.57 | 0.59 | 0.58 | 0.57 | 0.32 | |
Bold text represents the highest recall for identifying the corresponding medical decision complexity class.
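The micro, macro, and weighted rows in the precision and recall tables aggregate the per-class values in three different ways. The sketch below shows the standard definitions for precision; the per-class counts are made up for illustration and are not the study's data, and the function name is hypothetical.

```python
def averaged_precision(counts):
    """counts maps class -> (true positives, false positives, support).

    support is the number of messages whose true label is that class.
    """
    per_class = {}
    for label, (tp, fp, _support) in counts.items():
        predicted = tp + fp
        # A class that is never predicted is given precision 0.
        per_class[label] = tp / predicted if predicted else 0.0

    total_tp = sum(tp for tp, _, _ in counts.values())
    total_predicted = sum(tp + fp for tp, fp, _ in counts.values())
    total_support = sum(support for _, _, support in counts.values())

    micro = total_tp / total_predicted                 # pool all predictions
    macro = sum(per_class.values()) / len(per_class)   # unweighted class mean
    weighted = sum(per_class[c] * counts[c][2] for c in counts) / total_support
    return per_class, micro, macro, weighted


# Hypothetical counts for 14 messages (illustration only).
toy_counts = {
    "No decision": (3, 1, 4),
    "Straightforward": (4, 4, 5),
    "Low": (1, 1, 4),
    "Moderate": (0, 0, 1),
}
per_class, micro, macro, weighted = averaged_precision(toy_counts)
print(per_class, round(micro, 2), round(macro, 2), round(weighted, 2))
```

Recall is averaged the same way, with false negatives in place of false positives. Because the macro average weights every class equally, a rare class that is never identified, such as Moderate here, pulls it well below the micro and weighted averages.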
The parameters of the optimal models tuned on precision and on recall.
| Machine learning algorithm | Multinomial Naïve Bayes | Random forest |
|---|---|---|
| Tuning based on precision | ||
| Bag of words | Alpha: 0.9, fit prior: True | Maximum depth: 9, maximum leaf nodes: 9, minimum samples split: 4, estimators number: 100 |
| TF-IDF | Alpha: 0.5, fit prior: False | Maximum depth: None, maximum leaf nodes: 12, minimum samples split: 2, estimators number: 30 |
| Medical terms | Alpha: 0.7, fit prior: True | Maximum depth: 8, maximum leaf nodes: 10, minimum samples split: 3, estimators number: 20 |
| Tuning based on recall | ||
| Bag of words | Alpha: 0.7, fit prior: True | Maximum depth: 6, maximum leaf nodes: 9, minimum samples split: 3, estimators number: 90 |
| TF-IDF | Alpha: 0.5, fit prior: False | Maximum depth: 8, maximum leaf nodes: 12, minimum samples split: 2, estimators number: 20 |
| Medical terms | Alpha: 0.9, fit prior: True | Maximum depth: 8, maximum leaf nodes: 12, minimum samples split: 2, estimators number: 20 |
Fig. 1. The confusion matrices for models tuned on precision. (A) confusion matrix of Naïve Bayes trained on bag of words. (B) confusion matrix of Naïve Bayes trained on TF-IDF. (C) confusion matrix of Naïve Bayes trained on medical terms. (D) confusion matrix of random forest trained on bag of words. (E) confusion matrix of random forest trained on TF-IDF. (F) confusion matrix of random forest trained on medical terms. The x-axis is the model's predicted decision complexity class, the y-axis is the true decision complexity class. The darker color corresponds to a higher precision value. The brighter color corresponds to a lower precision value. (Color version of figure is available online.)
Fig. 2. Confusion matrices for models tuned using recall. (A) confusion matrix of Naïve Bayes trained on bag of words. (B) confusion matrix of Naïve Bayes trained on TF-IDF. (C) confusion matrix of Naïve Bayes trained on medical terms. (D) confusion matrix of random forest trained on bag of words. (E) confusion matrix of random forest trained on TF-IDF. (F) confusion matrix of random forest trained on medical terms. The x-axis is the model's predicted decision complexity class, the y-axis is the true decision complexity class. The darker color corresponds to a higher recall value. The brighter color corresponds to a lower recall value. (Color version of figure is available online.)
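A confusion matrix with the layout described in these captions (true class on the y-axis, predicted class on the x-axis) can be built as sketched below. Whether the figures were normalized by column for precision and by row for recall is an assumption; the labels are the four classes from the tables, and the example messages and function names are hypothetical.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count messages per (true, predicted) pair; rows are true classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, guess in zip(y_true, y_pred):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def normalize_rows(matrix):
    """Divide each row by its total, so the diagonal holds per-class recall."""
    return [[c / sum(row) if sum(row) else 0.0 for c in row] for row in matrix]


def normalize_columns(matrix):
    """Divide each column by its total, so the diagonal holds per-class precision."""
    totals = [sum(col) for col in zip(*matrix)]
    return [[c / t if t else 0.0 for c, t in zip(row, totals)] for row in matrix]


labels = ["No decision", "Straightforward", "Low", "Moderate"]
# Hypothetical labels for five messages (illustration only).
y_true = ["No decision", "Straightforward", "Straightforward", "Low", "Moderate"]
y_pred = ["No decision", "Straightforward", "Low", "Straightforward", "Straightforward"]

cm = confusion_matrix(y_true, y_pred, labels)
recall_view = normalize_rows(cm)
precision_view = normalize_columns(cm)
print(cm)
```

In this toy example the Moderate column is empty because the class is never predicted, so its precision entry is reported as 0 rather than left undefined.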