Yahya Albalawi, Nikola S Nikolov, Jim Buckley.
Abstract
BACKGROUND: In recent years, social media has become a major channel for health-related information in Saudi Arabia. Prior health informatics studies have suggested that a large proportion of health-related posts on social media are inaccurate. Given the subject matter and the scale of dissemination of such information, it is important to be able to automatically discriminate between accurate and inaccurate health-related posts in Arabic.
Keywords: BERT; bidirectional encoder representations from transformers; deep learning; health informatics; health information; infodemiology; language model; machine learning; misinformation; pretrained language models; social media; tweets
Year: 2022 PMID: 35767322 PMCID: PMC9280463 DOI: 10.2196/34834
Source DB: PubMed Journal: JMIR Form Res ISSN: 2561-326X
Summary of studies that analyzed the accuracy of health information on social media.
| Studies | Number of tweets or documents | Sources | Methods to label | Language covered | Percentage inaccurate | Topics covered | Type of study |
| Swetland et al [ | 358 | — | Expert votes; relabeling in cases of disagreement | English | 25.4% inaccurate | COVID-19 | Exploratory |
| Albalawi et al [ | 109 | — | Two physicians; deletion in cases of disagreement | Arabic | 31% inaccurate | General | Quantitative pilot study |
| Saeed et al [ | 208 | — | Expert votes; relabeling in cases of disagreement | Arabic | 38% inaccurate | Cancer | MLa |
| Sharma et al [ | 183 | — | Two physicians; deletion in cases of disagreement | English | 12% inaccurate | Zika | Quantitative |
| Alnemer et al [ | 625 | — | Vote if the experts do not agree | Arabic | 50% inaccurate | Only tweets from health professionals | Quantitative and exploratory study |
| Zhao et al [ | 5000 | Health forum | Annotator voting; in addition, an expert was consulted to validate information labeled as misleading | Chinese | 11.4% misinformation | Autism | ML |
| Sell et al [ | 2460 | — | Coders checked interagreement on 200 tweets | English | 10% inaccurate | Ebola | Quantitative |
| Chew and Eysenbach [ | 5395 | — | Coders checked agreement on 125 tweets; claims were labeled inaccurate if unsubstantiated by the following reference standards: the CDCb and the Public Health Agency of Canada for scientific claims, and a panel of credible web-based news sources (eg, CNNc and BBCd) for news-related claims | English | 4.5% inaccurate | H1N1 | Exploratory |
| Sicilia et al [ | 800 | — | Annotators' agreement; relabeling in cases of disagreement; here, misinformation was defined as "news items without a source" | English | Unknown | Zika | ML |
| Kalyanam et al [ | 47 million | — | Type of hashtags | English | 25% of the analyzed tweets were speculative | Ebola | Quantitative |
| Al-Rakhami and Al-Amri [ | 409,484 | Twitter; keywords | Although coders were used, their definition of a rumor included the lack of a source, so unconfirmed information was automatically classified as uncredible; in addition, tweets were classified by only 1 coder, who checked interagreement on 20 tweets | Not noted, but the keywords were in English | 70% uncredible | COVID-19 | ML |
| Elhadad et al [ | 7486 | Various websites | Fact-checking websites and official websites | English | 21% | COVID-19 | ML |
| Seltzer et al [ | 500 | — | Coders' agreement | English | 23% | Zika | Exploratory |
| Ghenai et al [ | 26,728 | — | Keywords defined from rumors identified on the WHOe website were used to extract tweets; the coders then labeled the tweets | English | 32% | Zika | ML |
aML: machine learning.
bCDC: Centers for Disease Control and Prevention.
cCNN: Cable News Network.
dBBC: British Broadcasting Corporation.
eWHO: World Health Organization.
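Several of the labeling protocols above hinge on checking agreement between coders on a subsample of tweets (eg, 200 tweets in Sell et al and 125 in Chew and Eysenbach). A common statistic for such checks is Cohen's kappa, which corrects raw agreement for agreement expected by chance. A minimal pure-Python sketch (the function name `cohens_kappa` is illustrative, not from the paper):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items.

    kappa = (po - pe) / (1 - pe), where po is observed agreement and
    pe is the agreement expected by chance from each annotator's
    label distribution. Undefined (division by zero) only when both
    annotators always assign the same single category.
    """
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    po = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal distribution.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    categories = set(counts_a) | set(counts_b)
    pe = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
    return (po - pe) / (1 - pe)

# Two coders agree on 3 of 4 tweets:
kappa = cohens_kappa(
    ["accurate", "inaccurate", "accurate", "accurate"],
    ["accurate", "inaccurate", "inaccurate", "accurate"],
)  # 0.5
```

Protocols then diverge on what to do with disagreements: relabel (Swetland et al, Saeed et al) or drop the item (Albalawi et al, Sharma et al).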
Summary of studies that developed MLa models to detect the accuracy of health-related information.
| Study | ML approach | Results | Labeling type |
| Elhadad et al [ | Deep learning multimodel, GRUb, LSTMc, and CNNd | 99.99% (F1 score) | Ground truth data from websites |
| Ghenai et al [ | Random forest | 94.5% (weighted average for F1 score) | Crowdsourced agreement, but keywords were based on 4 rumors identified from the WHOe website |
| Al-Rakhami and Al-Amri [ | Ensemble learning and random forest+SVMf | 97.8% (accuracy) | Single annotator only after confirming source |
| Zhao et al [ | Random forest | 84.4% (F1 score) | Annotator vote; in addition, consulted an expert to validate misleading information |
| Sicilia et al [ | Random forest | 69.9% (F1 score) | Agreement of a health expert |
| Saeed et al [ | Random forest | 83.5% (accuracy) | Agreement of a health expert |
aML: machine learning.
bGRU: gated recurrent unit.
cLSTM: long short-term memory.
dCNN: convolutional neural network.
eWHO: World Health Organization.
fSVM: support vector machine.
Figure 1. Overview of the process followed in labeling tweets as either accurate or inaccurate [31]. ML: machine learning.
Figure 2. Overview of the process used to train and select machine learning models. BLSTM: bidirectional long short-term memory.
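The selection step in Figure 2 amounts to scoring each candidate model on held-out data and keeping the best. A minimal sketch of that final step, assuming candidates have already been evaluated by macro F1 on a validation set (all names and scores below are hypothetical, not the paper's results):

```python
def select_best_model(validation_macro_f1):
    """Return the candidate name with the highest validation macro F1.

    `validation_macro_f1` maps model name -> macro F1 on a held-out set.
    """
    return max(validation_macro_f1, key=validation_macro_f1.get)

# Hypothetical validation scores for illustration only:
candidates = {"SVM": 0.78, "random forest": 0.82, "fine-tuned BERT": 0.87}
best = select_best_model(candidates)  # "fine-tuned BERT"
```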
Pretrained language models.
| Name | Basis | Size | Corpus |
| ARBERT [ | BERTa-base | 61 GB of MSAb text (6.5 billion tokens) | Books and news (news and Wikipedia articles) |
| MARBERT [ | BERT-base | 128 GB of text (15.6 billion tokens) | 1 billion Arabic tweets |
| QARiB [ | BERT-base | 14 billion tokens; vocabulary: 64,000 | 420 million tweets and approximately 180 million sentences of text from Arabic Giga Word, Abulkhair Arabic Corpus, and OPUSc |
| ArabicBERT [ | BERT-base and BERT-large | 95 GB of text and 8.2 billion words | Arabic OSCARd version, Wikipedia, and other resources |
| AraBERTv0.2e [ | BERT-base and BERT-large | 77 GB, 200,095,961 lines, 8,655,948,860 words, or 82,232,988,358 characters | OSCAR (unshuffled and filtered), Arabic Wikipedia articles, the 1.5 billion words Arabic Corpus, the OSIANf corpus, and Assafir news articles |
| AraBERTv2g [ | BERT-base and BERT-large | 77 GB, 200,095,961 lines, 8,655,948,860 words, or 82,232,988,358 characters | OSCAR (unshuffled and filtered), Arabic Wikipedia articles, the 1.5 billion words Arabic corpus, the OSIAN corpus, and Assafir news articles |
aBERT: bidirectional encoder representations from transformers.
bMSA: Modern Standard Arabic.
cOPUS: open parallel corpus.
dOSCAR: Open Superlarge Crawled Aggregated corpus.
eAraBERTv0.2: Transformer-based Model for Arabic Language Understanding version 0.2.
fOSIAN: Open Source International Arabic News.
gAraBERTv2: Transformer-based Model for Arabic Language Understanding version 2.
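All of the models above are BERT-style encoders that can be fine-tuned for binary sequence classification (accurate vs inaccurate). A hedged sketch of inference with the Hugging Face `transformers` library; the model ID `aubmindlab/bert-base-arabertv2` is the publicly released AraBERTv2 checkpoint, and whether it matches the exact weights used in the study is an assumption. Before fine-tuning, the classification head is randomly initialized, so outputs are not meaningful until the model is trained.

```python
def logits_to_label(logits, labels=("inaccurate", "accurate")):
    """Map a pair of classification logits to a label string.

    The label order is an assumption for illustration; it is fixed by
    however the training data was encoded.
    """
    best = max(range(len(logits)), key=lambda i: logits[i])
    return labels[best]

if __name__ == "__main__":
    # Requires `pip install transformers torch`; downloads weights on first run.
    from transformers import AutoTokenizer, AutoModelForSequenceClassification
    import torch

    name = "aubmindlab/bert-base-arabertv2"  # AraBERTv2 (assumed checkpoint)
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

    encoded = tokenizer("tweet text here", return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**encoded).logits[0].tolist()
    print(logits_to_label(logits))
```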
Comparison of the performance of machine learning models for detecting the accuracy of health-related tweets.
| Model and class | Precision | Recall | F1 score | Macroaverage | Model accuracy |
| Inaccurate | 0.804 | 0.7627 | 0.7826 | 0.8279 | 0.8397 |
| Accurate | 0.86 | 0.8866 | 0.8731 | 0.8279 | 0.8397 |
| | | | | | |
| Inaccurate | 0.8276 | 0.8136b | 0.8205b | 0.8564b | 0.8654b |
| Accurate | 0.8878 | 0.8969 | 0.8923 | 0.8564b | 0.8654b |
| | | | | | |
| Inaccurate | 0.8519 | 0.7797 | 0.8142 | 0.8543b | 0.8654b |
| Accurate | 0.8725 | 0.9175 | 0.8945 | 0.8543b | 0.8654b |
| | | | | | |
| Inaccurate | 0.8448 | 0.8305d | 0.8376d | 0.8701d | 0.8782d |
| Accurate | 0.898d | 0.9072 | 0.9025d | 0.8701d | 0.8782d |
| | | | | | |
| Inaccurate | 0.7759 | 0.7627 | 0.7692 | 0.8154 | 0.8269 |
| Accurate | 0.8571 | 0.866 | 0.8615 | 0.8154 | 0.8269 |
| | | | | | |
| Inaccurate | 0.7903 | 0.8305d | 0.8099 | 0.8447 | 0.8526 |
| Accurate | 0.8936 | 0.866 | 0.8796 | 0.8447 | 0.8526 |
| | | | | | |
| Inaccurate | 0.7797 | 0.7797 | 0.7797 | 0.8228 | 0.8333 |
| Accurate | 0.866 | 0.866 | 0.866 | 0.8228 | 0.8333 |
| | | | | | |
| Inaccurate | 0.8654 | 0.7627 | 0.8108 | 0.8532 | 0.8654b |
| Accurate | 0.8654 | 0.9278b | 0.8955b | 0.8532 | 0.8654b |
| | | | | | |
| Inaccurate | 0.8913d | 0.6949 | 0.781 | 0.83492 | 0.8525 |
| Accurate | 0.8364 | 0.9485d | 0.8889 | 0.83492 | 0.8525 |
| | | | | | |
| Inaccurate | 0.7719 | 0.7458 | 0.7586 | 0.8079 | 0.8205 |
| Accurate | 0.8485 | 0.866 | 0.8571 | 0.8079 | 0.8205 |
| | | | | | |
| Inaccurate | 0.8542 | 0.6949 | 0.7664 | 0.8222 | 0.8397 |
| Accurate | 0.8333 | 0.9278b | 0.8780 | 0.8222 | 0.8397 |
| | | | | | |
| Inaccurate | 0.8261 | 0.6441 | 0.7238 | 0.7919 | 0.8141 |
| Accurate | 0.8091 | 0.9175 | 0.8148 | 0.7919 | 0.8141 |
| | | | | | |
| Inaccurate | 0.7925 | 0.7119 | 0.75 | 0.805 | 0.8205 |
| Accurate | 0.835 | 0.8866 | 0.86 | 0.805 | 0.8205 |
| | | | | | |
| Inaccurate | 0.6865 | 0.7797 | 0.7302 | 0.7737 | 0.7821 |
| Accurate | 0.8571 | 0.866 | 0.8172 | 0.7737 | 0.7821 |
| | | | | | |
| Inaccurate | 0.7313 | 0.8305d | 0.7777 | 0.8136 | 0.8205 |
| Accurate | 0.8144 | 0.8144 | 0.8494 | 0.8136 | 0.8205 |
| | | | | | |
| Inaccurate | 0.8158 | 0.5254 | 0.6392 | 0.7382 | 0.7756 |
| Accurate | 0.7627 | 0.9278b | 0.8372 | 0.7382 | 0.7756 |
aAraBERTv2: Transformer-based Model for Arabic Language Understanding version 2.
bRepresents the second-best value.
cAraBERTv0.2: Transformer-based Model for Arabic Language Understanding version 0.2.
dIndicates the best value.
eBERT: bidirectional encoder representations from transformers.
fBLSTM: bidirectional long short-term memory.
gCBOW: Continuous Bag of Words.
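The columns in the comparison table are related: the F1 score is the harmonic mean of precision and recall, and the macroaverage column is the unweighted mean of the two per-class F1 scores. A quick sketch verifying this against the first block of the table; small discrepancies in the last digit are expected because the paper computes these values from raw prediction counts rather than the rounded precision and recall shown:

```python
def f1(precision, recall):
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

def macro_average(scores):
    # Unweighted mean across classes.
    return sum(scores) / len(scores)

# First block of the table: per-class (precision, recall) as reported.
f1_inaccurate = f1(0.804, 0.7627)  # ~0.7828 (table reports 0.7826)
f1_accurate = f1(0.86, 0.8866)     # ~0.8731
macro = macro_average([f1_inaccurate, f1_accurate])  # ~0.8279
```

Model accuracy differs from the macroaverage because it weights classes by their support, which is why the two columns diverge most for models with unbalanced per-class recall.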