Jinying Chen, John Lalor, Weisong Liu, Emily Druhl, Edgard Granillo, Varsha G Vimalananda, Hong Yu.
Abstract
BACKGROUND: Improper dosing of medications such as insulin can cause hypoglycemic episodes, which may lead to severe morbidity or even death. Although secure messaging was designed for exchanging nonurgent messages, patients sometimes report hypoglycemia events through secure messaging. Detecting these patient-reported adverse events may help alert clinical teams and enable early corrective actions to improve patient safety.
Keywords: adverse event detection; drug-related side effects and adverse reactions; hypoglycemia; imbalanced data; natural language processing; secure messaging; supervised machine learning
Year: 2019 PMID: 30855231 PMCID: PMC6431826 DOI: 10.2196/11990
Source DB: PubMed Journal: J Med Internet Res ISSN: 1438-8871 Impact factor: 5.428
Figure 1. Workflow of HypoDetect. ROS: random oversampling; SMOTE: synthetic minority oversampling technique; SVM: support vector machine; TF-IDF: term frequency-inverse document frequency.
Figure 2. Equations for (1) synthetic minority oversampling technique, (2) inverse document frequency (IDF), (3) term frequency-inverse document frequency (TF-IDF), and (4) F1 measure.
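The figure itself is not reproduced in this text, so the four equations it names are sketched here in their commonly used textbook forms. The symbols are assumptions introduced for illustration: $x_i$ is a minority-class sample, $x_{nn}$ one of its nearest minority-class neighbors, $\lambda$ a random interpolation weight, $N$ the number of documents, $\mathrm{df}(t)$ the number of documents containing term $t$, and $P$ and $R$ precision and recall. The exact variants used in the paper (for example IDF smoothing or the log base) may differ.

```latex
\begin{align}
x_{\text{new}} &= x_i + \lambda \,(x_{nn} - x_i), \qquad \lambda \in [0, 1] \tag{1} \\
\mathrm{idf}(t) &= \log \frac{N}{\mathrm{df}(t)} \tag{2} \\
\text{tf-idf}(t, d) &= \mathrm{tf}(t, d) \cdot \mathrm{idf}(t) \tag{3} \\
F_1 &= \frac{2 \, P \, R}{P + R} \tag{4}
\end{align}
```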
Performance of 3 variants of HypoDetect systems on the evaluation set.
| Systems | AUC-ROC [a] | Precision | Sensitivity (recall) | Specificity | F1 score | Accuracy |
| Rule-based method | 0.815 | 0.284 | 0.491 | 0.951 | 0.360 | 0.934 |
| Baseline | 0.945 | 0.614 | 0.377 | 0.991 | 0.467 | 0.966 |
| Class weighting | 0.952 | 0.529 | 0.561 | 0.980 | 0.545 | 0.964 |
| RUS-ensemble [b] | 0.949 | 0.198 | 0.921 | 0.852 | 0.326 | 0.855 |
| ROS-ensemble [c] | 0.950 | 0.559 | 0.500 | 0.984 | 0.528 | 0.966 |
| SMOTE-ensemble [d] | 0.951 | 0.564 | 0.500 | 0.985 | 0.530 | 0.966 |
| Baseline | 0.942 | 0.000 | 0.000 | 1.000 | 0.000 | 0.962 |
| Class weighting | 0.927 | 0.428 | 0.570 | 0.970 | 0.489 | 0.955 |
| RUS-ensemble | 0.928 | 0.143 | 0.904 | 0.787 | 0.248 | 0.791 |
| ROS-ensemble | 0.931 | 0.318 | 0.728 | 0.938 | 0.443 | 0.930 |
| SMOTE-ensemble | 0.942 | 0.486 | 0.596 | 0.975 | 0.535 | 0.961 |
| Baseline | 0.947 | 0.660 | 0.307 | 0.994 | 0.419 | 0.968 |
| Class weighting | 0.954 | 0.513 | 0.693 | 0.974 | 0.590 | 0.963 |
| RUS-ensemble | 0.946 | 0.192 | 0.912 | 0.849 | 0.318 | 0.851 |
| ROS-ensemble | 0.951 | 0.536 | 0.526 | 0.982 | 0.531 | 0.965 |
| SMOTE-ensemble | 0.951 | 0.566 | 0.552 | 0.983 | 0.559 | 0.943 |
[a] AUC-ROC: area under the receiver operating characteristic curve.
[b] RUS-ensemble: ensemble models using random undersampling.
[c] ROS-ensemble: ensemble models using random oversampling.
[d] SMOTE-ensemble: ensemble models using synthetic minority oversampling technique.
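The columns of this table are tied together by two identities, which the following minimal sketch checks against three of the reported rows: F1 is the harmonic mean of precision and sensitivity, and accuracy is the prevalence-weighted mix of sensitivity and specificity. The prevalence value is an assumption inferred from the table, not a figure stated in the text: the baseline row with sensitivity 0.000 and specificity 1.000 reports accuracy 0.962, which implies roughly 3.8% positive messages. The function and variable names are hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def accuracy(sensitivity, specificity, prevalence):
    """Accuracy as the prevalence-weighted mix of sensitivity and specificity."""
    return sensitivity * prevalence + specificity * (1 - prevalence)


# Assumption: positive-class prevalence inferred from the all-negative
# baseline row (sensitivity 0.000, specificity 1.000, accuracy 0.962).
PREVALENCE = 1 - 0.962

# (label, precision, sensitivity, specificity, reported F1, reported accuracy)
ROWS = [
    ("Rule-based method", 0.284, 0.491, 0.951, 0.360, 0.934),
    ("Class weighting", 0.529, 0.561, 0.980, 0.545, 0.964),
    ("RUS-ensemble", 0.198, 0.921, 0.852, 0.326, 0.855),
]

for label, prec, sens, spec, f1_reported, acc_reported in ROWS:
    f1 = f1_score(prec, sens)
    acc = accuracy(sens, spec, PREVALENCE)
    print(f"{label}: F1 {f1:.3f} (reported {f1_reported:.3f}), "
          f"accuracy {acc:.3f} (reported {acc_reported:.3f})")
```

The same arithmetic shows why accuracy is a weak yardstick on imbalanced data: a system that flags nothing still scores 0.962, while its F1 is 0.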
Performance of different HypoDetect systems implemented by using all types of features or by respectively dropping each individual type of feature.
| Systems | AUC-ROC [a] | Precision | Sensitivity (recall) | Specificity | F1 score | Accuracy |
| All | 0.952 | 0.529 | 0.561 | 0.980 | 0.545 | 0.964 |
| Without TF-IDF [b] | 0.920 | 0.263 | 0.737 | 0.919 | 0.388 | 0.912 |
| Without topic | 0.949 | 0.569 | 0.579 | 0.983 | 0.574 | 0.967 |
| Without domain relevance | 0.928 | 0.348 | 0.623 | 0.954 | 0.447 | 0.941 |
| All | 0.942 | 0.486 | 0.596 | 0.975 | 0.535 | 0.961 |
| Without TF-IDF | 0.938 | 0.364 | 0.632 | 0.956 | 0.462 | 0.944 |
| Without topic | 0.935 | 0.392 | 0.640 | 0.961 | 0.487 | 0.949 |
| Without domain relevance | 0.901 | 0.365 | 0.237 | 0.984 | 0.287 | 0.955 |
| All | 0.954 | 0.513 | 0.693 | 0.974 | 0.590 | 0.963 |
| Without TF-IDF | 0.917 | 0.248 | 0.754 | 0.910 | 0.373 | 0.904 |
| Without topic | 0.950 | 0.500 | 0.640 | 0.975 | 0.561 | 0.962 |
| Without domain relevance | 0.901 | 0.437 | 0.579 | 0.971 | 0.498 | 0.956 |
[a] AUC-ROC: area under the receiver operating characteristic curve.
[b] TF-IDF: term frequency-inverse document frequency.
[c] SMOTE-ensemble: ensemble models using synthetic minority oversampling technique.
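One way to read this ablation table is as the change in F1 when a feature type is dropped, relative to the "All" row of the same block. The sketch below does this for the first block of four rows only; which system that block belongs to is not recoverable from this text, and the names `ABLATION_F1` and `f1_change` are hypothetical.

```python
# F1 scores copied from the first block of rows in the table above.
ABLATION_F1 = {
    "All": 0.545,
    "Without TF-IDF": 0.388,
    "Without topic": 0.574,
    "Without domain relevance": 0.447,
}


def f1_change(scores, baseline="All"):
    """Return the F1 difference of each ablated system versus the baseline row."""
    base = scores[baseline]
    return {
        name: round(value - base, 3)
        for name, value in scores.items()
        if name != baseline
    }


changes = f1_change(ABLATION_F1)
for name, delta in sorted(changes.items(), key=lambda item: item[1]):
    print(f"{name}: {delta:+.3f}")
```

A negative value means the dropped feature type was helping F1 in that block; a positive value means the system scored higher without it.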