Zubair Shah, Didi Surian, Amalie Dyda, Enrico Coiera, Kenneth D Mandl, Adam G Dunn.
Abstract
BACKGROUND: Tools used to appraise the credibility of health information are time-consuming to apply and require context-specific expertise, limiting their use for quickly identifying and mitigating the spread of misinformation as it emerges.
Keywords: credibility appraisal; health misinformation; machine learning; social media
Year: 2019 PMID: 31682571 PMCID: PMC6862002 DOI: 10.2196/14007
Source DB: PubMed Journal: J Med Internet Res ISSN: 1438-8871 Impact factor: 5.428
Figure 1. The steps used to define the training dataset and automatically label Web pages.
Figure 2. The proportion of Web pages that met the individual criteria in the 474 Web pages used to train the classifiers. cri: criterion.
The parameters and corresponding values for the initialization of the language model and classifier.
| Parameters | Value |
| Weight decay | 1.00E-04 |
| Backpropagation through time | 60 |
| Batch size | 52 |
| Dropouts | 0.25, 0.1, 0.2, 0.02, 0.15 |
| Embedding size | 400 |
| Number of layers | 3 (language model), 5 (classifier) |
| Optimizer | Adam |
| β1, β2 | 0.8, 0.99 |
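The values in the table above can be collected into a single configuration object. The following is a minimal sketch only: the class name `LanguageModelConfig` and all field names are hypothetical, and the table does not say which dropout value applies to which layer, so the five values are kept as an unlabeled tuple in the order listed.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LanguageModelConfig:
    """Hypothetical container for the reported initialization values."""

    weight_decay: float = 1e-4
    bptt: int = 60  # backpropagation through time (sequence length)
    batch_size: int = 52
    # Order as listed in the table; the mapping to layers is not stated.
    dropouts: Tuple[float, ...] = (0.25, 0.1, 0.2, 0.02, 0.15)
    embedding_size: int = 400
    lm_layers: int = 3
    classifier_layers: int = 5
    optimizer: str = "Adam"
    betas: Tuple[float, float] = (0.8, 0.99)  # Adam beta1, beta2


config = LanguageModelConfig()
```

Freezing the dataclass keeps the reported values from being changed by accident during an experiment.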
Figure 3. The performance difference of the language model (LM) for 2 different settings, including training loss (top-left), validation cross-entropy loss (top-right), and the accuracy of the LM predicting the next word in a sentence given previous words in the validation text (bottom).
The parameters used for support vector machine and random forest classifiers; all other parameters are kept as default.
| Parameters | Value |
| Support vector machine | |
| C | 100 |
| Gamma | 1 |
| Kernel | linear |
| Norm | l1 |
| Use-idf[a] | TRUE |
| Max-df[b] | 1 |
| N-gram range | (1,1) |
| Random forest | |
| N-estimators | 10 |
| Criterion | Gini |
| Min-impurity-split | 1.00E-07 |
[a] Use-idf: when true, term weights are scaled by the number of documents they appear in.
[b] Max-df: when set to 1, words that appear in every document are not removed.
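The two footnotes describe how term weighting behaves under these settings. The sketch below illustrates that behavior in plain Python and is not the authors' pipeline. It makes several assumptions: the function `tfidf_l1` is hypothetical, tokenization is by whitespace on unigrams, `max_df` is read as a proportion of documents, and the idf uses a smoothed form, ln((1 + n) / (1 + df)) + 1.

```python
import math
from collections import Counter


def tfidf_l1(docs, use_idf=True, max_df=1.0):
    """Toy tf-idf over whitespace-tokenized unigrams with L1-normalized rows."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Keep terms whose share of documents does not exceed max_df.
    vocab = sorted(t for t, c in df.items() if c / n_docs <= max_df)
    keep = set(vocab)
    rows = []
    for tokens in tokenized:
        counts = Counter(t for t in tokens if t in keep)
        weights = {}
        for term, tf in counts.items():
            # Smoothed inverse document frequency; 1.0 when idf is disabled.
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1 if use_idf else 1.0
            weights[term] = tf * idf
        total = sum(weights.values())
        # L1 normalization: the weights in each row sum to 1.
        rows.append({t: w / total for t, w in weights.items()} if total else {})
    return vocab, rows


docs = ["vaccines are safe", "vaccines cause harm"]  # hypothetical documents
vocab, rows = tfidf_l1(docs)
```

With `max_df=1.0` the word "vaccines", which appears in every document, stays in the vocabulary but receives a lower weight than words that appear in only one document.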
Performance of the classifiers (average F1 score and accuracy in 10-fold cross-validation).
| Criterion | Deep learning[a], mean (SD) | | Support vector machines[a], mean (SD) | | Random forests[a], mean (SD) | |
| | F1 score | Accuracy | F1 score | Accuracy | F1 score | Accuracy |
| 1 | 0.851 (0.005) | 0.740 (0.008) | 0.903 (0.032) | 0.842 (0.045) | 0.924 (0.019) | |
| 2 | 0.000 (0.000) | 0.638 (0.003) | 0.802 (0.044) | 0.828 (0.018) | 0.943 (0.006) | |
| 3 | 0.000 (0.000) | 0.865 (0.009) | 0.917 (0.011) | 0.745 (0.088) | 0.944 (0.018) | |
| 4 | 0.882 (0.001) | 0.789 (0.002) | 0.903 (0.042) | 0.833 (0.068) | 0.936 (0.022) | |
| 5 | 0.551 (0.249) | 0.486 (0.051) | 0.787 (0.034) | 0.721 (0.051) | 0.920 (0.020) | |
| 6 | 0.867 (0.002) | 0.765 (0.004) | 0.912 (0.006) | 0.852 (0.010) | 0.943 (0.004) | |
| 7 | 0.000 (0.000) | 0.840 (0.008) | 0.924 (0.006) | 0.764 (0.057) | 0.936 (0.004) | |
[a] The classifier with the highest F1-score is italicized for each criterion.
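Each cell in the table is a mean and SD over the 10 cross-validation folds. The sketch below shows one way such a summary can be computed and why an F1 score of 0.000 can sit alongside a nonzero accuracy. The function names are hypothetical, the labels are made up for illustration, and the use of the sample SD is an assumption, since the source does not say which SD was reported.

```python
from statistics import mean, stdev


def f1_and_accuracy(y_true, y_pred):
    """F1 score for the positive class (label 1) and overall accuracy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return f1, correct / len(pairs)


def summarize(fold_scores):
    """Mean and sample SD of one metric across folds (sample SD is assumed)."""
    return mean(fold_scores), stdev(fold_scores)


# A classifier that never predicts the positive class has an F1 score of 0
# but is still correct on every negative example.
f1, acc = f1_and_accuracy([1, 0, 0, 1, 0], [0, 0, 0, 0, 0])
```

This is the pattern a row with an F1 score of 0.000 (0.000) and a nonzero accuracy suggests: the classifier predicted the majority (negative) class for every page in every fold.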
Figure 4. A subset of the terms that were informative of low-credibility scores in the training set of 474 Web pages. Terms at the top are those most over-represented in low-credibility Web pages compared with other Web pages, and terms at the bottom are those most under-represented in low-credibility Web pages compared with other Web pages. OR: odds ratio; Inf: infinity.
Figure 5. The sum of tweets and retweets for links to included Web pages relative to the number of credibility criteria satisfied.
Figure 6. The distribution of potential exposures per Web page for low (orange), medium (gray), and high (cyan) credibility scores, where low credibility includes scores from 0 to 2, and high credibility includes scores from 5 to 7.
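The odds ratio in this caption compares how often a term appears in low-credibility pages with how often it appears in all other pages. The sketch below is illustrative only: `term_odds_ratio` is a hypothetical helper, the counts in the usage note are invented, and the source does not state whether the authors applied any smoothing or correction. An infinite value arises when a term never appears in the comparison group.

```python
import math


def term_odds_ratio(low_with, low_total, other_with, other_total):
    """Odds ratio for a term appearing in low-credibility vs other pages."""
    a = low_with                    # low-credibility pages with the term
    b = low_total - low_with        # low-credibility pages without it
    c = other_with                  # other pages with the term
    d = other_total - other_with    # other pages without it
    numerator, denominator = a * d, b * c
    if denominator == 0:
        # Undefined when both products are zero, otherwise unbounded.
        return math.inf if numerator > 0 else math.nan
    return numerator / denominator
```

For example, with invented counts, `term_odds_ratio(10, 50, 5, 100)` gives 4.75, and `term_odds_ratio(3, 50, 0, 100)` gives infinity because the term is absent from every page in the comparison group.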
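The grouping in this caption can be written as a small mapping from the number of criteria satisfied to a credibility band. In this sketch the function name `credibility_band` is hypothetical, and the medium band of 3 to 4 is inferred from the stated low (0 to 2) and high (5 to 7) ranges.

```python
def credibility_band(score):
    """Map a credibility score (0 to 7 criteria met) to low, medium, or high."""
    if not 0 <= score <= 7:
        raise ValueError("score must be between 0 and 7")
    if score <= 2:
        return "low"
    if score >= 5:
        return "high"
    return "medium"  # scores 3 and 4, inferred from the caption


bands = [credibility_band(s) for s in range(8)]
```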
Figure 7. A network visualization representing the subset of 98,663 Twitter users who posted tweets including links to vaccine-related Web pages at least twice and were connected to at least one other user in the largest connected component. Users who posted at least 2 high-credibility Web pages and no low-credibility Web pages (cyan) and those who posted at least 2 low-credibility Web pages and no high-credibility Web pages (orange) are highlighted. The size of the nodes is proportional to the number of followers each user has on Twitter, and nodes are positioned by a heuristic such that well-connected groups of users are more likely to be positioned close together in the network diagram.
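The highlighting rule in this caption can be expressed as a function over the credibility bands of the pages a user linked to. This is a minimal sketch: `user_group` and its return labels are hypothetical names, and the input is assumed to be a list of "low", "medium", or "high" labels, one per shared page.

```python
from collections import Counter


def user_group(shared_bands):
    """Assign a user to a highlighted group from the bands of shared pages."""
    counts = Counter(shared_bands)
    if counts["high"] >= 2 and counts["low"] == 0:
        return "high-only"  # drawn in cyan in the figure
    if counts["low"] >= 2 and counts["high"] == 0:
        return "low-only"   # drawn in orange in the figure
    return "other"          # not highlighted
```

A user who shared both low-credibility and high-credibility pages falls into neither highlighted group, whatever the counts.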