Md Rafiqul Islam, Shaowu Liu, Xianzhi Wang, Guandong Xu.
Abstract
Recently, the use of social networks such as Facebook, Twitter, and Sina Weibo has become an inseparable part of our daily lives. These networks are considered convenient platforms for users to share personal messages, pictures, and videos. However, while people enjoy social networks, many deceptive activities such as fake news or rumors can mislead users into believing misinformation. Moreover, the spread of massive amounts of misinformation in social networks has become a global risk. Therefore, misinformation detection (MID) in social networks has gained a great deal of attention and is considered an emerging area of research interest. We find that several studies related to MID have explored new research problems and techniques. However, the automated detection of misinformation is difficult to accomplish, as it requires an advanced model to understand how related or unrelated the reported information is when compared to real information. Existing studies have mainly focused on three broad categories of misinformation: false information, fake news, and rumor detection. Building on these, we present a comprehensive survey of automated misinformation detection covering (i) false information, (ii) rumors, (iii) spam, (iv) fake news, and (v) disinformation. We provide a state-of-the-art review of MID in which deep learning (DL) is used to automatically process data and create patterns for decision making, not only to extract global features but also to achieve better results. We further show that DL is an effective and scalable technique for state-of-the-art MID. Finally, we suggest several open issues that currently limit real-world implementation and point to future directions along this dimension. © Springer-Verlag GmbH Austria, part of Springer Nature 2020.
Keywords: Decision making; Deep learning; Misinformation detection; Neural network; Online social networks
Year: 2020 PMID: 33014173 PMCID: PMC7524036 DOI: 10.1007/s13278-020-00696-x
Source DB: PubMed Journal: Soc Netw Anal Min
Fig. 1 Social relationships between different users
Fig. 2 Key types related to misinformation
Fig. 3 Sample responses to a rumor claim. Figure courtesy of Ma et al. (2019)
Comparison of different types of misinformation
| Type | Characteristics | Objectiveness | Severity | Integrity | References |
|---|---|---|---|---|---|
| Rumors | Ambiguous | Not sure | Low | Not sure | Shu et al. ( |
| False information | Deception | Yes | High | False | Kumar and Shah ( |
| Fake news | Misguided | Yes | Medium | False | Sharma et al. ( |
| Spam | Confused | Yes | Low | Not sure | Çıtlak et al. ( |
| Disinformation | Mislead/deceive | Yes | Medium | False | Guo et al. ( |
List of popular deep learning frameworks
| Framework | Key Point | Interface Support | CNN/RNN Support | References |
|---|---|---|---|---|
| Caffe | Caffe is one of the most popular deep learning frameworks | C, C++, Python, MATLAB | Yes | Jia et al. ( |
| Torch | Because it uses the fast scripting language LuaJIT, Torch provides faster performance than other frameworks | C/C++ | Yes | Collobert et al. ( |
| PyTorch | PyTorch is a Python port of the Torch deep learning framework | Python | Yes | Ketkar ( |
| DL4j | DL4j is used for text mining, NLP, and image recognition | Java, Scala, and JVM | Yes | Parvat et al. ( |
| Neon | Neon is designed for ease of use and extensibility | Python | Yes | Pouyanfar et al. ( |
| TensorFlow | TensorFlow is one of the best deep learning frameworks for natural language processing, speech recognition, and image processing | Python, C++, and R | Yes | Abadi et al. ( |
| Keras | Keras is part of the TensorFlow core API and is used for text generation, summarization, and classification | Python | Yes | Chollet ( |
| CNTK | CNTK is an open-source deep learning framework for training models on image, speech, and text-based data | Python, C++ | Yes | Shi et al. ( |
| Theano | Theano allows users to define, optimize, and evaluate mathematical expressions on arrays and tensors | C, Python | Yes | Van Merriënboer et al. ( |
| Dlib | Dlib is an independent, cross-platform, open-source software library | C++ | CNN: Yes; RNN: No | King ( |
| Torch | Torch is an open-source machine learning library that provides a large number of algorithms for deep learning | C | Yes | Collobert et al. ( |
| BigDL | BigDL is a distributed deep learning framework | Python | Yes | Dai et al. ( |
Deep learning methods showing the performances within the applications of social network research
| Models | Input Data: Text | Input Data: Visual | User Response | Problem Tackled | References |
|---|---|---|---|---|---|
| CNN+LSTM | | | | Disinformation detection | Dhamani et al. ( |
| LSTM+BiLSTM | | | | False claim detection | Popat et al. ( |
| RCNN | | | | False information detection | Wu et al. ( |
| BiLSTM | | | | Misinformation detection | Zhang et al. ( |
| RNN+GRU | | | | Fake news detection | Shu et al. ( |
| CNN+Attention | | | | Review spam detection | Gong et al. ( |
| CNN+LSTM | | | | Spam detection | Shahariar et al. ( |
| LSTM+Attention | | | | Early rumor detection | Chen et al. ( |
| Attention | | | | Misinformation identification | Liu et al. ( |
| LSTM+Attention | | | | Fake news detection | Popat et al. ( |
| RNN | | | | Fake news detection | Ruchansky et al. ( |
| CNN | | | | Misinformation identification | Jia et al. ( |
| LSTM+Attention | | | | Rumor detection | Guo et al. ( |
| RNN | | | | Rumor detection | Jin et al. ( |
| GRU | | | | Rumor detection | Li et al. ( |
| CNN+GRU | | | | Early detection of fake news | Liu and Wu ( |
| RNN | | | | Rumor detection | Ma et al. ( |
| CNN+LSTM | | | | Rumor detection | Nguyen et al. ( |
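
Several of the models listed above pair a recurrent encoder with an attention layer. As a rough illustration of the attention step only, the following sketch pools a sequence of hidden states into one vector using dot-product scoring. It is a minimal sketch under stated assumptions, not the method of any cited paper: the function names `softmax` and `attention_pool`, the `query` vector, and the toy values are all hypothetical, and the surveyed models may use different scoring functions and learned parameters.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Dot-product attention pooling (illustrative assumption).

    Each hidden state is scored by its dot product with `query`,
    the scores are normalized with softmax, and the hidden states
    are averaged using those weights.
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(query)
    pooled = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return pooled, weights


# Toy hidden states, e.g. one per post in a thread (hypothetical values).
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]  # hypothetical query vector; learned in a real model

pooled, weights = attention_pool(hidden_states, query)
```

In a trained model the pooled vector would typically feed a classifier layer, and the weights indicate which inputs contributed most to the decision.
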
Fig. 4 Classification of deep learning models
Fig. 5 RNN architecture used for fake news detection. Figure courtesy of Shu et al. (2019a)
Fig. 6 The architecture of the generative adversarial learning model. Figure courtesy of Ma et al. (2019)
Fig. 7 An illustration of the hybrid model. Figure courtesy of Ruchansky et al. (2017)
Summary of datasets used by existing efforts
| Dataset | Problem Tackled | Text Content | Number of Instances | Number of Classes | Ground Truth | References |
|---|---|---|---|---|---|---|
| BuzzfeedPolitical | Fake news detection | | 120 | 2 | | Silverman et al. ( |
| LIAR | Fake news detection | | 12.8K | 6 | | Wang ( |
| CREDBANK | Fact extraction | | 4856 | 2 | | Mitra and Gilbert ( |
| FakeNewsNet | Rumor detection | | CNN | | | Shu et al. ( |
| | Rumor detection | | 1111 | 2 | | Ma et al. ( |
| PHEME | Rumor detection | | 6425 | 2 | | Aiello et al. ( |
| NewsFN-2014 | Fake news detection | | 221 | 5 | | Nan et al. ( |
| PolitiFact | Fake news detection | | 488 | 2 | | Bathla et al. ( |
| | Rumor detection | | 816 | 2 | | Ma et al. ( |
| YelpChi | Fake review detection | | 67K | 2 | | Mukherjee et al. ( |
| YelpNYC | Spam detection | | 359K | 2 | | Rayana and Akoglu ( |
| YelpZip | Spam detection | | 608K | 2 | | Rayana and Akoglu ( |
| Twitter dataset | Spam detection | | 5.5M | 2 | | Concone et al. ( |
| KaggleEmergent (a) | Rumor detection | | 2145 | 3 | | |
| KaggleFN (b) | Fake news detection | | 13K | 1 | | |
| FacebookHoax (c) | Hoax detection | | 15.5K | 2 | | Tacchini et al. ( |
| BuzzfeedNews | Misleading detection | | 2282 | 4 | | Silverman et al. ( |
| Enron email (d) | Disinformation detection | | 0.5M | 2 | | Dhamani et al. ( |
| Fraudulent email (e) | Disinformation detection | | 2500 | 2 | | Dhamani et al. ( |
| Italian dataset | Disinformation detection | | 160K | 2 | | Pierri et al. ( |

(a) https://www.kaggle.com/arminehn/rumor-citation
(b) https://www.kaggle.com/mrisdal/fake-news
(c) https://github.com/gabll/some-like-it-hoax/tree/master/dataset
(d) https://www.kaggle.com/wcukierski/enron-email-dataset
(e) https://www.kaggle.com/rtatman/fraudulent-email-corpus