Yuqi Zhang, Bin Guo, Yasan Ding, Jiaqi Liu, Chen Qiu, Sicong Liu, Zhiwen Yu.
Abstract
The rapid dissemination of misinformation on social media during the COVID-19 pandemic triggers panic and threatens pandemic preparedness and control. Correction is a crucial countermeasure for debunking misperceptions. However, the mechanisms that make corrections effective on social media have not been fully verified. Previous work focuses on psychological theories and experimental studies, and it is unclear how well those conclusions apply to actual social media. This study explores the determinants of the effectiveness of misinformation corrections on social media by combining a data-driven approach with related theories from psychology and communication. Specifically, drawing on the Backfire Effect, Source Credibility, and theories of the audience's role in dissemination, we propose five hypotheses covering seven potential factors (regarding correction content and publishers' influence), e.g., the proportion of original misinformation and warnings of misinformation. We then obtain 1487 significant COVID-19-related corrections posted on Microblog between January 1, 2020 and April 30, 2020, and annotate each correction according to these factors. A comprehensive analysis of the dataset yields several promising conclusions. For example, mentioning a large amount of the original misinformation in a correction does not undermine the believability of the correction within a short period after reading; warnings of misinformation in a demanding tone make corrections less effective; and the determinants of correction effectiveness vary across topics of misinformation. Finally, we build a regression model to predict correction effectiveness. These results provide practical suggestions for correcting misinformation on social media, and a tool that helps practitioners revise corrections before publishing to improve their efficacy.
Keywords: Cognitive factor; Correction; Data-driven; Microblog; Misinformation; Social media
Year: 2022; PMID: 35400028; PMCID: PMC8979789; DOI: 10.1016/j.ipm.2022.102935
Source DB: PubMed; Journal: Inf Process Manag; ISSN: 0306-4573; Impact factor: 7.466
Fig. 1 Illustration of methodology.
Description of collected data.
| Category | Definition | Example of the correction post |
|---|---|---|
| Control measures in the epidemic | Relates to control measures such as transportation control, lockdown, resumption of work, return to school, vaccines, etc. | “On 9th March, a message spread widely online that XX University had released a notice telling senior students to return to school for graduation in June. Today, the college states that this message is fabricated. They mention that the return to school will be arranged in the near future.” |
| Situation of the epidemic | Associated with new cases of COVID-19, virus transmission, isolation of potential virus carriers, etc. | “On 11th March, someone spread the message ‘one person was diagnosed with COVID-19 infection’. We solemnly declare that this news is a rumor. We have reported it to the police, and the originator will bear legal responsibility for it.” |
| Medical knowledge | Common medical knowledge, virus prevention, virus self-testing, etc. | “Recently there has been news online that taking a sip of water every 15 min to keep the throat moist can prevent the virus. According to specialists, there is no relationship between virus infection and a dry throat, and drinking too much water may put extra strain on the body.” |
| Supply and safety | About medical supplies, food safety, and so on. | “China is a major producer of masks in the world, and its annual export volume remains stable at more than 70% of its production scale. China has never issued a ban on the export of masks and their raw materials, and enterprises can trade according to market-oriented principles, said Li Xinggan, director-general of the Department of Foreign Trade of the Ministry of Commerce.” |
Labeling and evaluation of collected data.
| Factor | Definition | Annotation standard |
|---|---|---|
| Proportion of original misinformation | The percentage of textual misinformation mentioned in the correction text | Calculated as the length of the original misinformation mentioned in the post divided by the length of the post. |
| Length of the post | The length of the text | The number of characters in the post, excluding special strings (e.g., URLs, emoticons, “@xxx”) and punctuation. |
| Explanation | Whether the post explains why the misinformation is wrong or why its originators disseminated the false information. | “0” for no, “1” for yes. For posts containing multiple corrections, annotated as the number of corrections providing an explanation divided by the total number of corrections mentioned in the post. |
| Graphic explanation | Whether the post contains an explanation in graphic form | “0” for no, “1” for yes. For posts containing multiple corrections, annotated as the number of corrections providing a graphic explanation divided by the total number of corrections mentioned in the post. |
| Textual warnings of misinformation | Whether the post contains a textual warning before first mentioning the misinformation | “0” for no, “1” for yes. For posts containing multiple corrections, annotated as the number of corrections containing textual warnings divided by the total number of corrections mentioned in the post. |
| Graphic warnings of misinformation | Whether the post contains a warning in graphic form before first mentioning the misinformation | “0” for no, “1” for yes. For posts containing multiple corrections, annotated as the number of corrections containing graphic warnings divided by the total number of corrections mentioned in the post. |
| Influence of publisher | The influence of the publisher of the post | Calculated as the sum of follower counts and other influence statistics, as presented in Eq. |
| Category | Topic of the post | Determined by the topic of the misinformation being corrected. For posts containing multiple corrections, decided by the majority topic. |
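The length, proportion, and fractional 0/1 annotation rules in the table above can be illustrated with a short sketch. This is only one plausible reading of the annotation standard, not the authors' tooling: the regular expressions for URLs, mentions, and bracketed emoticon codes, the punctuation set, the choice to ignore whitespace, and all function names are assumptions made for illustration.

```python
import re
import string
from typing import Sequence

# Assumed patterns for the "special strings" named in the annotation standard.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
EMOTICON_RE = re.compile(r"\[[^\[\]]{1,10}\]")  # assumed bracketed codes such as [doge]

# ASCII punctuation plus some common full-width marks (an assumed, incomplete set).
PUNCT = set(string.punctuation) | set("，。！？；：、“”‘’（）《》【】…")


def post_length(text: str) -> int:
    """Count characters after dropping special strings, punctuation, and whitespace."""
    for pattern in (URL_RE, MENTION_RE, EMOTICON_RE):
        text = pattern.sub("", text)
    return sum(1 for ch in text if not ch.isspace() and ch not in PUNCT)


def misinformation_proportion(misinfo_span: str, post: str) -> float:
    """Length of the quoted misinformation divided by the length of the whole post."""
    total = post_length(post)
    return post_length(misinfo_span) / total if total else 0.0


def fractional_label(flags: Sequence[bool]) -> float:
    """0/1 label for a single correction; for several, the fraction that satisfy the criterion."""
    if not flags:
        raise ValueError("a post must contain at least one correction")
    return sum(flags) / len(flags)
```

For a post that bundles four corrections of which three carry an explanation, `fractional_label([True, False, True, True])` gives 0.75, which matches the "number of corrections providing explanation divided by the total number of corrections" rule.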
Parameters of models.
| Algorithm | Parameter |
|---|---|
| SVR | kernel='rbf', C=0.6 |
| KNN | n_neighbors=15, p=1, weights='distance' |
| Random Forest | n_estimators=330, max_depth=10, min_samples_leaf=1, max_features=3 |
| XGBoost | learning_rate=0.05, n_estimators=68, reg_alpha=0.1, reg_lambda=0.9, gamma=0, subsample=0.75, colsample_bytree=0.85, max_depth=15, min_child_weight=7 |
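The parameter names in the table match the keyword arguments of scikit-learn and XGBoost regressors, so the settings could be reproduced roughly as follows. The record does not state which libraries were used, so the choice of `SVR`, `KNeighborsRegressor`, `RandomForestRegressor`, and `XGBRegressor` is an assumption, as are the names `PARAMS` and `build_models`.

```python
# Hyperparameters copied from the table above; the dictionary layout is illustrative.
PARAMS = {
    "SVR": {"kernel": "rbf", "C": 0.6},
    "KNN": {"n_neighbors": 15, "p": 1, "weights": "distance"},
    "Random Forest": {
        "n_estimators": 330,
        "max_depth": 10,
        "min_samples_leaf": 1,
        "max_features": 3,
    },
    "XGBoost": {
        "learning_rate": 0.05,
        "n_estimators": 68,
        "reg_alpha": 0.1,
        "reg_lambda": 0.9,
        "gamma": 0,
        "subsample": 0.75,
        "colsample_bytree": 0.85,
        "max_depth": 15,
        "min_child_weight": 7,
    },
}


def build_models():
    """Instantiate one regressor per row of the table (assumed libraries)."""
    # Imported lazily so PARAMS can be inspected without the libraries installed.
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.neighbors import KNeighborsRegressor
    from sklearn.svm import SVR
    from xgboost import XGBRegressor

    constructors = {
        "SVR": SVR,
        "KNN": KNeighborsRegressor,
        "Random Forest": RandomForestRegressor,
        "XGBoost": XGBRegressor,
    }
    return {name: constructors[name](**PARAMS[name]) for name in PARAMS}
```

Each returned model exposes the usual `fit(X, y)` and `predict(X)` methods, so the four algorithms can be trained and compared on the same feature matrix.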
Spearman coefficient between potential factors and correction effectiveness.
| Factor | Coefficients | Sig. |
|---|---|---|
| Proportion of original misinformation | −0.033 | 0.208 |
| Length of the post | −0.085** | 0.001 |
| Explanation | 0.003 | 0.895 |
| Graphic explanation | 0.063* | 0.016 |
| Textual warnings of misinformation | −0.103** | 0.000 |
| Graphic warnings of misinformation | −0.104** | 0.000 |
| Influence of publisher | 0.484** | 0.000 |
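For readers unfamiliar with the statistic, a Spearman coefficient is the Pearson correlation of the rank-transformed variables, with tied values sharing their average rank. The sketch below shows that computation in plain Python; it does not compute the significance values in the "Sig." column, and the function names are illustrative rather than taken from the study.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks matter, a factor with a heavily skewed distribution, such as publisher influence, can be compared with correction effectiveness without first transforming it.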
Fig. 2 Effects of “length of the post” and the “influence of publisher” on correction effectiveness.
Fig. 3 Distributions of warnings and explanations in corrections.
Fig. 4 Correction effectiveness and influencing factor in different topics of corrections.
Spearman coefficient between influencing factors and correction effectiveness in various topics of corrections.
| Factor | Control measures in the epidemic: Coefficient | Control measures in the epidemic: Sig. | Situation of the epidemic: Coefficient | Situation of the epidemic: Sig. | Supply and safety: Coefficient | Supply and safety: Sig. |
|---|---|---|---|---|---|---|
| Proportion of original misinformation | −0.125* | 0.019 | 0.012 | 0.737 | −0.288** | 0.001 |
| Length of the post | 0.004 | 0.939 | −0.100** | 0.006 | 0.021 | 0.816 |
| Explanation | 0.023 | 0.662 | −0.013 | 0.725 | 0.120 | 0.193 |
| Graphic explanation | −0.006 | 0.911 | 0.126** | 0.001 | 0.049 | 0.592 |
| Textual warnings of misinformation | −0.194** | 0.000 | −0.091* | 0.012 | −0.290** | 0.001 |
| Graphic warnings of misinformation | −0.065 | 0.218 | −0.121** | 0.001 | −0.125 | 0.174 |
| Influence of publisher | 0.540** | 0.000 | 0.451** | 0.000 | 0.458* | 0.000 |
Fig. 5 Distributions on social interactions and followers of users involved in the dissemination of the correction post.
Evaluations of trained regression models.
| Algorithm | Training set: MAE | Training set | Testing set: MAE | Testing set |
|---|---|---|---|---|
| SVR | 0.0867 | 0.3127 | 0.0849 | 0.3690 |
| KNN | 0.0005 | 0.9978 | 0.0793 | 0.4035 |
| Random Forest | 0.0639 | 0.6277 | 0.0784 | 0.4585 |
| XGBoost | ||||
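The MAE columns report mean absolute error, the average absolute difference between predicted and observed correction effectiveness. A minimal sketch of the metric, with an illustrative function name, is:

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |observed - predicted| over all samples."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
```

The near-zero training MAE for KNN is probably a consequence of `weights='distance'`: each training sample is its own nearest neighbor at distance zero, so the model reproduces its training targets almost exactly, and only the testing-set columns are informative for that row.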
Feature importance of regression model.
| Factor | Random Forest feature importance | XGBoost feature importance |
|---|---|---|
| Category | 0.0895 | 0.1216 |
| Length of the post | 0.1982 | 0.0942 |
| Graphic explanation | 0.0212 | 0.0964 |
| Textual warnings of misinformation | 0.0639 | 0.3263 |
| Graphic warnings of misinformation | 0.0242 | 0.0843 |
| Influence of publisher | 0.6031 | 0.2772 |