Junxiang Wang, Liang Zhao, Yanfang Ye, Yuji Zhang.
Abstract
BACKGROUND: Vaccines have been one of the most successful public health interventions to date. However, vaccines are pharmaceutical products that carry risks, and many adverse events (AEs) are reported after vaccination. Traditional adverse event reporting systems suffer from several crucial challenges, including poor timeliness. This has motivated a growing number of social media-based detection systems, which have demonstrated the ability to capture timely and prevalent disease information. Despite these advantages, social media-based AE detection suffers from serious challenges such as labor-intensive labeling and class imbalance in the training data.
Keywords: Formal reports; Multi-instance learning; Social media; Vaccine adverse event detection
Year: 2018 PMID: 29925405 PMCID: PMC6011255 DOI: 10.1186/s13326-018-0184-y
Source DB: PubMed Journal: J Biomed Semantics
Fig. 1 Overview of the proposed framework. VAERS: Vaccine Adverse Event Reporting System. MILR: Multi-instance Logistic Regression
An example of a formal report and an example of a tweet
| Formal report | Tweet |
|---|---|
| T-dap 2 days ago | As soon as I walk |
| developed | in my apartment, |
| my | |
| decides to remind me | |
| by allergist referral sent. | I got a |
Keywords are shown in bold type
Model performance with no formal reports versus 2500 formal reports on five metrics (the highest value for each metric is highlighted in bold type): multi-instance learning methods outperformed the baselines
| Method | #Formal reports | ACC | PR | RE | FS | AUC |
|---|---|---|---|---|---|---|
| SVM(linear) | 0 | 0.7793 | 0.7309 | 0.6100 | 0.6644 | 0.7916 |
| | 2500 | 0.7296 | 0.6241 | 0.6370 | 0.6294 | 0.7234 |
| SVM(poly) | 0 | 0.6412 | 0.7231 | 0.3611 | 0.3069 | 0.5697 |
| | 2500 | 0.5478 | 0.5311 | 0.5497 | 0.4443 | 0.6416 |
| SVM(rbf) | 0 | 0.6507 | 0.6948 | 0.0572 | 0.1035 | 0.8069 |
| | 2500 | 0.5897 | 0.4652 | | 0.6210 | 0.7754 |
| LR | 0 | 0.7665 | 0.6765 | 0.6641 | 0.6700 | 0.7524 |
| | 2500 | 0.7322 | 0.6209 | 0.6576 | 0.6384 | 0.7303 |
| NN | 0 | 0.7924 | 0.7408 | 0.6273 | 0.6790 | 0.8196 |
| | 2500 | 0.7411 | 0.6414 | 0.6396 | 0.6394 | 0.7366 |
| miFV | 0 | 0.7818 | 0.7269 | 0.6352 | 0.6775 | 0.8348 |
| | 2500 | 0.7856 | 0.7331 | 0.6403 | 0.6833 | 0.8361 |
| miVLAD | 0 | 0.7691 | 0.7261 | 0.5832 | 0.6461 | 0.8390 |
| | 2500 | 0.7863 | 0.7055 | 0.6999 | | 0.8201 |
| MILR | 0 | 0.8034 | 0.7858 | 0.6231 | 0.6947 | 0.8676 |
| | 2500 | | | 0.6291 | 0.6984 | |
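The first four columns of the table above can be related to one another through a binary confusion matrix. The following is a minimal sketch, assuming ACC, PR, RE and FS stand for accuracy, precision, recall and F-score (the abbreviations are not expanded in this excerpt). The `Confusion` class, the `metrics` function and the example counts are hypothetical and are not taken from the study. AUC is omitted because it needs ranked scores rather than counts.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Binary confusion counts (hypothetical container, not from the paper)."""
    tp: int  # AE-indicative items predicted as indicative
    fp: int  # non-indicative items predicted as indicative
    tn: int  # non-indicative items predicted as non-indicative
    fn: int  # AE-indicative items predicted as non-indicative


def metrics(c: Confusion) -> dict:
    """Return accuracy, precision, recall and F-score; 0.0 when undefined."""
    total = c.tp + c.fp + c.tn + c.fn
    acc = (c.tp + c.tn) / total if total else 0.0
    pr = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    re = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    # F-score as the harmonic mean of precision and recall.
    fs = 2 * pr * re / (pr + re) if (pr + re) else 0.0
    return {"ACC": acc, "PR": pr, "RE": re, "FS": fs}


# Made-up counts for illustration only.
example = metrics(Confusion(tp=6, fp=2, tn=10, fn=2))
print(example)
```

A per-fold F-score computed this way need not equal the harmonic mean of the averaged PR and RE columns, so this sketch should not be expected to reproduce the FS column from the PR and RE columns.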
Fig. 2 Receiver operating characteristic (ROC) curves for different numbers of added formal reports: multi-instance learning methods outperformed the baselines no matter how many formal reports were added. a No formal reports, b 500 formal reports, c 1000 formal reports, d 1500 formal reports, e 2000 formal reports, f 2500 formal reports
Fig. 3 Metric trends of all classifiers for different numbers of added formal reports: formal reports consistently improved the performance metrics of the multi-instance learning methods, while they affected the performance of the baselines negatively. a SVM(linear), b SVM(poly), c SVM(rbf), d LR, e NN, f miFV, g miVLAD, h MILR
Model performance using MILR with smaller training sizes (the highest value for each metric is highlighted in bold type): the effect of formal reports was more obvious when the training size was smaller
| Twitter data (#Training) | Formal (#Report) | ACC | PR | RE | FS | AUC |
|---|---|---|---|---|---|---|
| 314 (20%) | 0 | 0.7731 | 0.7278 | 0.5923 | 0.6525 | 0.8446 |
| | 500 | 0.7812 | 0.7323 | 0.6212 | 0.6713 | 0.8539 |
| | 1000 | 0.8112 | 0.7993 | 0.6356 | 0.7076 | 0.8888 |
| | 1500 | | 0.7935 | 0.6524 | 0.7151 | 0.8923 |
| | 2000 | 0.8114 | 0.7812 | 0.6612 | | 0.8916 |
| | 2500 | 0.8112 | 0.7824 | 0.6590 | 0.7147 | 0.8904 |
| 786 (50%) | 0 | 0.7939 | 0.7689 | 0.6141 | 0.6816 | 0.8646 |
| | 500 | 0.7920 | 0.7651 | 0.6125 | 0.6790 | 0.8684 |
| | 1000 | 0.8041 | 0.7682 | 0.6567 | 0.7064 | 0.8834 |
| | 1500 | 0.8034 | 0.7720 | 0.6482 | 0.7031 | 0.8834 |
| | 2000 | 0.8092 | 0.7968 | 0.6312 | 0.7044 | 0.8897 |
| | 2500 | 0.8066 | 0.7711 | | 0.7108 | 0.8866 |
| 1048 (67%) | 0 | 0.7952 | 0.7841 | 0.5953 | 0.6767 | 0.8646 |
| | 500 | 0.7850 | 0.7615 | 0.5915 | 0.6645 | 0.8653 |
| | 1000 | 0.7983 | 0.7948 | 0.5937 | 0.6795 | 0.8843 |
| | 1500 | 0.7996 | 0.7944 | 0.5992 | 0.6830 | 0.8880 |
| | 2000 | 0.8034 | 0.7984 | 0.6080 | 0.6903 | 0.8899 |
| | 2500 | 0.8060 | | 0.6133 | 0.6949 | 0.8910 |
| 1179 (75%) | 0 | 0.7952 | 0.7845 | 0.5927 | 0.6752 | 0.8664 |
| | 500 | 0.7933 | 0.7695 | 0.6010 | 0.6743 | 0.8846 |
| | 1000 | 0.8034 | 0.7881 | 0.6172 | 0.6915 | 0.8948 |
| | 1500 | 0.8041 | 0.7913 | 0.6154 | 0.6915 | 0.8963 |
| | 2000 | 0.8041 | 0.7940 | 0.6119 | 0.6901 | 0.8983 |
| | 2500 | 0.8041 | 0.7940 | 0.6119 | 0.6901 | |
Fig. 4 Keyword frequencies of tweets indicating AEs, with no formal reports versus 2500 formal reports: frequent keywords remained stable. a No formal reports, b 2500 formal reports
Two users and their corresponding tweets
| User Id | Corresponding tweets | Indicative or not |
|---|---|---|
| 246090881 | Got my annual employer-paid | Not |
| | Now I have a | Indicative |
| | Starting to | Not |
| 206180021 | Getting a | Not |
| | Or Gamera! Gamera flies through the air like a spinning firework. Anyone who hates Gamera is | Not |
| | Personally, I don’t like something about the sound of “The Tower Heist” movie. Yup, something about that makes me | Not |
Keywords are displayed in bold type
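The two users above illustrate the multi-instance setting: each user is a bag of tweets, and a user counts as AE-indicative when at least one tweet is indicative. The following is a minimal sketch of how per-tweet probabilities from a logistic model might be aggregated into a user-level score. Both aggregation rules (maximum and noisy-OR), the function names and the example probabilities are illustrative assumptions and are not claimed to be the paper's MILR formulation.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def instance_prob(weights: Sequence[float], bias: float,
                  features: Sequence[float]) -> float:
    """Probability that a single tweet (instance) is AE-indicative."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(score)


def bag_prob_max(instance_probs: Sequence[float]) -> float:
    """User-level score as the most indicative tweet (assumed rule)."""
    return max(instance_probs)


def bag_prob_noisy_or(instance_probs: Sequence[float]) -> float:
    """User-level score as 1 minus the chance that no tweet is indicative."""
    p_none = 1.0
    for p in instance_probs:
        p_none *= 1.0 - p
    return 1.0 - p_none


# Hypothetical per-tweet probabilities for one user with three tweets.
tweet_probs = [0.1, 0.9, 0.2]
print(bag_prob_max(tweet_probs))
print(bag_prob_noisy_or(tweet_probs))
```

Under either rule, a single strongly indicative tweet is enough to give the user a high score, which matches the labeling of the first user in the table.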