Dheeraj Kumar Dixit, Amit Bhagat, Dharmendra Dangi.
Abstract
In recent years, rumours and fake news have spread widely and rapidly all over the world, leading to the production and propagation of inaccurate news articles. Users further amplify misinformation and fake news by sharing it without proper verification. Hence, it is necessary to restrict the spread of fake information on mass media and to promote confidence worldwide. To this end, this paper addresses the effective detection of fake news. The proposed methodology consists of four phases: data pre-processing, feature reduction, feature extraction and classification. During data pre-processing, the input data are processed by tokenization, stop-word removal and stemming. In the second phase, the features are reduced by employing PPCA to enhance accuracy. The extracted features are then passed to the classification phase, where the LSTM-LF algorithm optimally classifies the news as fake or real. Furthermore, the paper uses four datasets for evaluation: Buzzfeed, GossipCop, ISOT and Politifact. The performance evaluation and comparative analysis indicate that the proposed approach performs better than other fake news detection approaches.
Keywords: Classification; Extraction; Fake news; LF distribution; LSTM; PPCA; Pre-processing; Reduction
Year: 2022 PMID: 35729952 PMCID: PMC9202495 DOI: 10.1007/s00500-022-07215-4
Source DB: PubMed Journal: Soft comput ISSN: 1432-7643 Impact factor: 3.732
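The pre-processing phase named in the abstract (tokenization, stop-word removal, stemming) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the stop-word list, the suffix list and all function names are hypothetical, and the crude suffix stripper merely stands in for whatever stemmer the paper actually used.

```python
import re

# Hypothetical, deliberately tiny stop-word list (assumption, not from the paper).
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "were",
              "of", "in", "on", "and", "to", "that"}

# Hypothetical suffixes for a crude stemmer; a stand-in for a real stemmer.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Run tokenization, stop-word removal and stemming in sequence."""
    return [stem(t) for t in remove_stop_words(tokenize(text))]
```

For example, `preprocess("The senators are spreading rumours!")` yields `["senator", "spread", "rumour"]` under these assumptions.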
Review of prior literature works
| References | Techniques | Simulation metrics | Datasets | Merits | Limitations | Transparency |
|---|---|---|---|---|---|---|
| Cui et al. | dEFEND | Precision, F1-measure, accuracy, MAP | GossipCop and PolitiFact | Enhanced detection performance | Failed to consider the posts and explainable comments | Outperformed 7 state-of-the-art fake news detection techniques by at least 5.33% in F1-score |
| Yuan et al. | SMAN | Accuracy, precision, recall, F1-measure | Weibo, Twitter-15 and Twitter-16 | High rate of accuracy | Requires time during execution | Fake news detected in 4 h with 91% accuracy |
| Duan et al. | MDP-LSTM | RMSE, MAPE, MSE, MAE | HDFS, Hadoop, Spark, OpenStack | High rate of robustness and accuracy | Failed to integrate log keywords in LSTM | Achieved 80% accuracy; average test speed is 1.648 × 10⁻³ kb/s |
| Ozbay and Alatas | Supervised artificial intelligence algorithms | Accuracy, precision, recall, F1-measure | Buzzfeed, random political datasets, ISOT fake news | High rate of detection accuracy | Failed to integrate ensemble approaches | Evaluated a two-step identification for fake news detection |
| Kesarwani et al. | K-nearest neighbour classifier | True label, accuracy, F1-measure | Buzzfeed | Classification accuracy was high | Complex during implementation | Classification accuracy tested against Facebook posts was 79% |
| Shu et al. | Hierarchical propagation network | Accuracy, precision, recall, F1-measure | GossipCop and PolitiFact | Highly robust | Unsupervised fake news detection | Significantly outperformed state-of-the-art fake news detection techniques by at least 1.7%, with an average F1 > 0.84 |
| Wang et al. | SemSeq4FD | Average and maximum length, accuracy, precision, recall | LUN and SLN, Weibo and RCED | Recognized multi-view fake news | Poor flexibility rate | Single dimensional network to learn local sentence representations |
| Vijjali et al. | Dual Stage Transformer Model | Accuracy, precision, MAP | COVID-19 dataset | High efficiency | Overall effectiveness is poor | Ability to run in real time with inexpensive compute |
| Zhang et al. | BDANN | Accuracy, precision, recall, F1-measure | Twitter and Weibo | Enhanced fake news detection | Failed to design probabilistic model | Achieved 82% accuracy |
| Khattar et al. | MVAE | Accuracy, precision, recall, F1-measure | Twitter and Weibo | Enhanced fake news detection | Failed to propagate the Twitter data | Accuracy boosted by more than 2.05% and 1.90% |
Fig. 1 Architecture of the PPCA and Levy Flight based LSTM fake news detection system
Fig. 2 LSTM architecture
Fig. 3 Proposed LSTM-LF approach for fake news detection
Parameter specifications
| Methods | Parameters | Values |
|---|---|---|
| PPCA | Size of the population | 50 |
| PPCA | Maximum number of iterations | 100 |
| PPCA | Total number of principal components | 3 |
| PPCA | Distance metrics | Euclidean distance |
| PPCA | Selection of initial points | Random selection of two different points |
| PPCA | PCCR | 92% |
| LSTM-LF | Network type | Fully-connected |
| LSTM-LF | Rate of dropout | 0.3 |
| LSTM-LF | Optimizer function | Adam |
| LSTM-LF | Activation | Softmax |
| LSTM-LF | Total number of epochs | 200 |
| LSTM-LF | Size of the population | 50 |
| LSTM-LF | Maximum number of iterations | 100 |
| LSTM-LF | Step length | |
| LSTM-LF | Comparative scalar value | 0.5 |
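The table lists a Levy flight step length for LSTM-LF without a value, and this page does not say how the steps are drawn. As a rough illustration only, the sketch below draws Levy-distributed steps with Mantegna's algorithm, a common choice in Levy flight based optimizers. The stability index `beta = 1.5`, the `scale` factor and all function names are assumptions, not values from the paper.

```python
import math
import random


def mantegna_sigma(beta):
    """Standard deviation of the numerator Gaussian in Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)


def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step; beta = 1.5 is an assumed stability index."""
    u = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def perturb(position, scale=0.01, beta=1.5, rng=random):
    """Move every coordinate of a candidate by a scaled Levy step.

    `scale` is a hypothetical step-size factor, not taken from the table.
    """
    return [x + scale * levy_step(beta, rng) for x in position]
```

In an optimizer with the population size of 50 and the 100 iterations given in the table, `perturb` would be applied to each candidate once per iteration; most moves are small, with occasional long jumps that help escape local optima.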
Training and testing specifications
| Datasets | Total data | Training data | Testing data |
|---|---|---|---|
| Buzzfeed | 1700 | 1000 | 700 |
| GossipCop | 17,520 | 17,020 | 500 |
| ISOT | 44,900 | 44,000 | 900 |
| Politifact | 700 | 500 | 200 |
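The split sizes above imply very different held-out proportions per dataset. The short sketch below computes them from the table's figures; the dictionary layout and function name are illustrative choices, not something the paper defines.

```python
# (total, training, testing) counts copied from the table above.
SPLITS = {
    "Buzzfeed": (1700, 1000, 700),
    "GossipCop": (17520, 17020, 500),
    "ISOT": (44900, 44000, 900),
    "Politifact": (700, 500, 200),
}


def held_out_fraction(name):
    """Return the share of a dataset reserved for testing."""
    total, train, held_out = SPLITS[name]
    # Sanity check: the two parts should add up to the stated total.
    if train + held_out != total:
        raise ValueError(f"inconsistent split for {name}")
    return held_out / total


if __name__ == "__main__":
    for dataset in SPLITS:
        print(f"{dataset}: {held_out_fraction(dataset):.1%} held out for testing")
```

Roughly 41% of Buzzfeed and 29% of Politifact are held out, against about 3% of GossipCop and 2% of ISOT, which is worth keeping in mind when comparing per-dataset scores.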
Confusion matrix regarding fake news
| Predicted class | Actual: positive class | Actual: negative class |
|---|---|---|
| Positive class | TP | FP |
| Negative class | FN | TN |
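The four measures reported in the figures (accuracy, specificity, precision, recall) follow from these confusion-matrix counts by their standard definitions. The sketch below is a minimal illustration; the function name and the example counts are hypothetical and are not results from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard metrics from confusion-matrix counts.

    tp, fp, fn, tn: true positives, false positives,
    false negatives and true negatives.
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # correct predictions over all samples
        "specificity": tn / (tn + fp),   # true-negative rate
        "precision": tp / (tp + fp),     # predicted positives that are correct
        "recall": tp / (tp + fn),        # actual positives that are found
    }


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    for name, value in classification_metrics(tp=40, fp=10, fn=5, tn=45).items():
        print(f"{name}: {value:.4f}")
```

The sketch assumes every denominator is non-zero; a production version would need to guard against classes with no samples.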
Fig. 4 Analysis of datasets for (a) accuracy, (b) specificity, (c) precision and (d) recall
Fig. 5 Performance rate analysis
Fig. 6 Overall performance analysis of the proposed approach
Fig. 7 Comparative analysis for (a) accuracy, (b) specificity, (c) precision and (d) recall