Pawan Kumar Verma, Prateek Agrawal, Vishu Madaan, Radu Prodan.
Abstract
Online social media enables low-cost, easy access to information and its rapid propagation and communication, including the spread of low-quality fake news. Fake news has become a serious threat to every sector of society, eroding trust in the media and leaving audiences bewildered. In this paper, we propose a new framework called Message Credibility (MCred) for fake news detection that exploits both local and global text semantics. The framework fuses Bidirectional Encoder Representations from Transformers (BERT), which uses the relationships between words in sentences to capture global text semantics, with Convolutional Neural Networks (CNN), which use N-gram features to capture local text semantics. Experimental results on a popular Kaggle dataset show that MCred improves accuracy over a state-of-the-art model by 1.10% thanks to its combination of local and global text semantics.
Keywords: Convolutional neural network; Deep learning; Dense network; Fake news classification; Global semantic; Local semantic; Natural language processing; Social media disinformation; Text classification
Year: 2022 PMID: 35910294 PMCID: PMC9325949 DOI: 10.1007/s12652-022-04338-2
Source DB: PubMed Journal: J Ambient Intell Humaniz Comput
GloVe model training details
| Corpora name | Number of tokens (billion) | Vocabulary size | Model size |
|---|---|---|---|
| Wikipedia 2014 + Gigaword 5 | 6 | 400,000 | 822 MB |
| Common Crawl | 42 | 1.9 M | 1.75 GB |
| Common Crawl | 840 | 2.2 M | 2.03 GB |
| Twitter | 27 | 1.2 M | 1.42 GB |
BERT model configuration
| Model | Layers | Hidden size | Self attention heads | Parameters |
|---|---|---|---|---|
| BERT Base | 12 | 768 | 12 | 110M |
| BERT Large | 24 | 1024 | 16 | 340M |
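The parameter counts in the table can be roughly reproduced from the layer count and hidden size alone. The sketch below is a back-of-the-envelope estimate, not the authors' code. It assumes the standard BERT layout with a 30,522-token WordPiece vocabulary, 512 positions, 2 segment types, and a feed-forward width of four times the hidden size; none of these values are stated in this record.

```python
def bert_parameter_estimate(layers: int, hidden: int,
                            vocab_size: int = 30522,   # assumed vocabulary size
                            max_positions: int = 512,  # assumed maximum sequence length
                            segment_types: int = 2) -> int:  # assumed segment count
    """Estimate the number of trainable parameters of a BERT encoder."""
    ffn = 4 * hidden          # assumed feed-forward (intermediate) width
    layer_norm = 2 * hidden   # scale and shift vectors of one LayerNorm

    # Token, position and segment embeddings, followed by one LayerNorm.
    embeddings = (vocab_size + max_positions + segment_types) * hidden + layer_norm
    # Query, key, value and output projections (weights + biases), plus LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # Two dense layers of the feed-forward block, plus LayerNorm.
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden) + layer_norm
    # Dense pooler applied to the first token's representation.
    pooler = hidden * hidden + hidden

    return embeddings + layers * (attention + feed_forward) + pooler


base = bert_parameter_estimate(layers=12, hidden=768)
large = bert_parameter_estimate(layers=24, hidden=1024)
print(f"BERT Base:  {base / 1e6:.1f}M parameters")
print(f"BERT Large: {large / 1e6:.1f}M parameters")
```

Under these assumptions the estimates land close to the rounded 110M and 340M figures in the table. The number of self-attention heads does not enter the count: the heads only split the hidden size into equal slices (768 / 12 = 1024 / 16 = 64 dimensions per head).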
Fig. 1 MCred sequence flow diagram
Fig. 2 BERT processing layer
Fig. 3 CNN processing layer
Fig. 4 Dense net processing layer
MCred model architecture
| Processing layer | Parameter | Value |
|---|---|---|
| BERT | Number of dense layers | 4 |
| | Dropout rate | 0.5 |
| | Activation function | ReLU |
| CNN | Number of dense layers | 1 |
| | Number of Conv1D layers | 3 |
| | Number of global average pool layers | 3 |
| | Activation function | ReLU |
| | Kernel size | 1, 2, 3 |
| Dense net | Number of dense layers | 2 |
| | Dropout rate | 0.5 |
| | Batch size | 64 |
| | Optimizer | Adam |
| | Activation function | Sigmoid |
| | Loss | Binary cross-entropy |
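The CNN rows describe three Conv1D layers with kernel sizes 1, 2 and 3, each followed by global average pooling, which is how the model captures unigram, bigram and trigram (local) features. The following is a minimal pure-Python sketch of that idea, not the authors' implementation. It uses one hand-written filter per kernel size and toy 2-dimensional embeddings; the function names and all numeric values are illustrative assumptions, and a real model would learn many filters per kernel size.

```python
from typing import List

Vector = List[float]


def relu(x: float) -> float:
    return x if x > 0.0 else 0.0


def conv1d_relu(tokens: List[Vector], kernel: List[Vector], bias: float = 0.0) -> List[float]:
    """Slide one filter over the token sequence ("valid" padding), then apply ReLU.

    `kernel` holds one weight vector per position of the n-gram window,
    so len(kernel) is the kernel size.
    """
    size = len(kernel)
    feature_map = []
    for start in range(len(tokens) - size + 1):
        total = bias
        for offset in range(size):
            total += sum(w * x for w, x in zip(kernel[offset], tokens[start + offset]))
        feature_map.append(relu(total))
    return feature_map


def global_average_pool(feature_map: List[float]) -> float:
    """Collapse a feature map of any length into a single number."""
    return sum(feature_map) / len(feature_map)


def ngram_features(tokens: List[Vector], kernels: List[List[Vector]]) -> List[float]:
    """One pooled feature per kernel, i.e. per n-gram width."""
    return [global_average_pool(conv1d_relu(tokens, kernel)) for kernel in kernels]


# Toy input: three tokens with 2-dimensional embeddings (hypothetical values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
kernels = [
    [[1.0, 1.0]],                          # kernel size 1: unigram filter
    [[1.0, 0.0], [0.0, 1.0]],              # kernel size 2: bigram filter
    [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]],  # kernel size 3: trigram filter
]
features = ngram_features(tokens, kernels)
print(features)
```

Global average pooling makes the output length independent of the article length, so the pooled n-gram features can be concatenated and passed to a dense layer.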
Training time
| Processing unit | Time (in seconds) |
|---|---|
| CPU | 10,800 |
| GPU | 3,600 |
Dataset description
| Dataset | Fake news | Real news | Characteristics |
|---|---|---|---|
| Kaggle (Lifferth) | 10,369 | 10,349 | Contains news title, text and author name |
| McIntire (Hamel) | 3,164 | 3,171 | News articles related to the 2016 US presidential election |
| FakeNews (Risdal) | 24,396 | 13,614 | News collected from heterogeneous sources and topics |
| WELFake (Verma et al.) | 37,106 | 35,028 | Minimizes the limitations of the other individual datasets |
Fake news prediction parameters
| Evaluation parameter | Predicted value | Actual value |
|---|---|---|
| True-positive (TP) | Yes | Yes |
| True-negative (TN) | No | No |
| False-positive (FP) | Yes | No |
| False-negative (FN) | No | Yes |
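The four outcomes above are the inputs to the accuracy, precision, recall and F1-score values reported in the results tables. As a small illustration, the sketch below computes these metrics from their standard definitions; the function name and the counts in the example are hypothetical and are not taken from the paper.

```python
from typing import Dict


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: share of predicted "Yes" items that are actually "Yes".
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual "Yes" items that were predicted "Yes".
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1-score: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=90, tn=85, fp=15, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

A classifier that labels almost every article "Yes" shows high recall but precision near the base rate, which is why all four metrics are reported together.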
MCred model results on various optimizers
| Parameter | Optimizer | Val_Loss | Val_Acc | Test accuracy | Test precision | Test recall | Test F1-score |
|---|---|---|---|---|---|---|---|
| Dropout (0.5) | Adam | | | | | | |
| Dropout (0.5) | SGD | 0.3898 | 0.8299 | 0.8258 | 0.8220 | 0.8161 | 0.8190 |
| Dropout (0.5) | RMSProp | 6.9984 | 0.5107 | 0.5071 | 0.5071 | 0.9901 | 0.6729 |
| Dropout (0.5) | Adagrad | 0.5159 | 0.7620 | 0.7587 | 0.7530 | 0.7477 | 0.7503 |
| Dropout (0.3) | Adam | 0.0442 | 0.9858 | 0.9852 | 0.9823 | 0.9881 | 0.9851 |
| Dropout (0.3) | SGD | 0.3905 | 0.8208 | 0.8164 | 0.8718 | 0.7199 | 0.7886 |
| Dropout (0.3) | RMSProp | 0.1408 | 0.9553 | 0.9481 | 0.9410 | 0.9516 | 0.9463 |
| Dropout (0.3) | Adagrad | 0.4551 | 0.7854 | 0.7844 | 0.7831 | 0.7680 | 0.7755 |
Fig. 5 Model accuracy and loss
Hyper-parameters for ML models tuning
| Model | Parameter | Value |
|---|---|---|
| LR | C | 0.01 |
| LR | Penalty | l2 |
| LR | Solver | lbfgs |
| NB | Smoothing | 1 |
| NB | Fit prior | true |
| DT | Maximum features | auto |
| DT | Criterion | gini |
| DT | Cost complexity pruning | 0.02 |
| RF | n_estimators | 50 |
| XGBoost | n_estimators | 100 |
| XGBoost | learning_rate | 0.01 |
MCred model performance comparison with other ML models
| Model Name | Accuracy (%) |
|---|---|
| Logistic Regression (LR) | 89.46 |
| Naive Bayes (NB) | 92.38 |
| Decision Tree (DT) | 93.56 |
| Random Forest (RF) | 94.12 |
| XGBoost | 97.65 |
MCred model performance comparison with other DL models
| Model Name | Accuracy (%) |
|---|---|
| BERT-CNN | 99.01 |
| BERT-RNN | 94.56 |
| BERT-LSTM | 96.94 |
Comparison of MCred with state-of-the-art methods
| | Mersinias et al. | Khan and Alhazmi | Kaliyar et al. | Rohit Kumar Kaliyar and Narang | MCred |
|---|---|---|---|---|---|
| Dataset accuracy | Kaggle: 97.52%; McIntire: 94.53%; FakeNews: 96.78% | Kaggle: 90.70% | Kaggle: 98.36% | Kaggle: 98.90% | Kaggle: 99.46%; McIntire: 97.16%; FakeNews: 97.98%; WELFake: 99.01% |
| Document representation features | Class label frequency distance vector | Doc2Vec | GloVe | BERT embeddings | GloVe – BERT embeddings |
| Classifier | Logistic regression (ML); CNN + LSTM (DL) | AdaBoost, LinearSVM | Deep CNN | CNN | CNN, BERT |