Hina Fatima Shahzad, Furqan Rustam, Emmanuel Soriano Flores, Juan Luís Vidal Mazón, Isabel de la Torre Diez, Imran Ashraf.
Abstract
Deep learning is used to address a wide range of challenging problems, including large-scale data analysis, image processing, object detection, and autonomous control. The same techniques are also used to build software that endangers privacy, democracy, and national security. Fake content in the form of audio, images, and videos, produced by digital manipulation with artificial intelligence (AI) approaches, has become widespread and a major concern during the past few years. Deepfakes use AI to swap the face of one person with that of another and generate hyper-realistic videos. Amplified by the speed of social media, deepfakes can reach millions of people almost immediately and can be dangerous tools for fake news, hoaxes, and fraud. Besides well-known movie stars, politicians have been victims of deepfakes, notably US presidents Barack Obama and Donald Trump; however, the public at large can also be targeted. To overcome the challenge of deepfake identification and mitigate its impact, considerable effort has gone into devising novel methods to detect face manipulation. This study also discusses how to counter the threats from deepfake technology and alleviate its impact. The findings suggest that, although deepfakes pose a serious threat to society, business, and political institutions, they can be combated through appropriate policies, regulation, individual actions, training, and education. In addition, further technological development is needed for deepfake identification, content authentication, and deepfake prevention. Existing studies have performed deepfake detection using machine learning and deep learning techniques such as support vector machine, random forest, multilayer perceptron, k-nearest neighbors, convolutional neural networks with and without long short-term memory, and other similar models. This study aims to highlight recent research on deepfake image and video detection, covering deepfake creation, various detection algorithms on self-made datasets, and existing benchmark datasets.
Keywords: deep learning; deepfake; image processing; video altering
Year: 2022 PMID: 35746333 PMCID: PMC9230855 DOI: 10.3390/s22124556
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
A comparative analysis of review/survey papers on deepfakes.
| Reference | Type | DF Detection | DF Creation | DF Tweets | Timeline | Published | Scope |
|---|---|---|---|---|---|---|---|
| [ | Survey | Image/Video | Image/Video | No | 2019 | arXiv | Covers deepfake creation and detection approaches presented from 2017 to 2020; however, it includes few studies from 2020. |
| [ | Survey | Image | Image | No | 2020 | Elsevier | The survey covers the studies on face manipulation approaches only and does not include deepfake video creation and detection. |
| [ | Survey | Image/Video | Image/Video | No | 2021 | arXiv | Recent studies on face synthesis, attribute manipulation, identity swap, and expression swap are discussed, in addition to the deepfake datasets. |
| [ | Survey | Image/Video | Image/Video | No | 2021 | ACM | Focuses on the potential of various deep learning networks for creating and detecting deepfakes. Well-known architectures from different studies are also discussed. |
| [ | Survey | Image/Video | Image/Video | No | 2021 | Springer | Gives a brief overview of different deepfake creation and detection tools but covers only a small range of studies. |
| Current | SLR | Image/Video | Image/Video | Yes | 2021 | Sensors | Focuses on the recent works regarding deepfake creation and detection techniques. In addition to images and videos, it covers deepfake tweets. Many recent studies are covered regarding famous deepfake apps and approaches. |
Figure 1. Research article search and selection methodology.
Figure 2. Flow chart of paper selection methodology.
Details of the selected articles with respect to sub-topics.
| Topic | No. of Articles |
|---|---|
| Deepfake Creation | 18 |
| Deepfake Detection | 20 |
| Deepfake Video Detection Using Image Processing Techniques | 7 |
| Deepfake Video Detection Using Physiological Signals | 4 |
| Deepfake Video Detection Using Biological Signals | 1 |
| Deepfake Audio Detection | 5 |
| Deepfake Image Detection | 7 |
| Deepfake Tweet Detection | 1 |
| Total | 58 |
Figure 3. Examples of original and deepfake videos.
Figure 4. Deepfake generation process using encoder–decoder pair [40].
Figure 5. Architecture of DeepFaceLab from [41].
Figure 6. Training and testing phases of FC-GAN [46].
Brief overview of deepfake face apps.
| Tool | Link & Key Features |
|---|---|
| DeepFaceLab | – |
| FSGAN | – |
| DiscoFaceGAN | – |
| FaceShifter | – |
| AvatarMe | – |
| “Do as I Do” Motion Transfer | – |
Figure 7. Types of deepfake videos and detection process.
Figure 8. Deepfake detection using CNN and LSTM [75].
Figure 9. Deepfake and original image: original image (left), deepfake (right) [92].
Figure 10. Deepfake and GANprintR-processed deepfake: (a) deepfake, (b) deepfake after GANprintR [93].
Comparison table of self-made datasets.
| Reference | Dataset | Classifier | Method | Performance |
|---|---|---|---|---|
| [ | Own dataset created [ | SVM | FP extraction method: | HOG 95%, |
| [ | UADFV | SVM | 3D head pose | 97.4% AUC |
| [ | Self-made dataset | DNN | Eyeblink + LRCN | 99% AUC |
| [ | Own dataset | CNN and LSTM | CNN_LSTM | 97.1% AUC |
| [ | Self-made dataset | KNN, SVM, and LDA | AttGAN, StarGAN, GDWCT, | 99.81% from StyleGAN2 |
| [ | 100K-Faces (StyleGAN) | Deep learning | CNN | EER = 0.3% from |
| [ | DFFD (ProGAN, StyleGAN) | Deep learning | CNN + attention mechanism | AUC = 100%, |
| [ | Own (Adobe Photoshop) | Deep learning features | DRN | AP = 99.8% |
| [ | Own (ProGAN, Adobe Photoshop) | Deep learning features | CNN | AUC = 99.9%, |
| [ | Self-made dataset | Machine learning | Eye blinking | 87.5% |
| [ | Own (Celebrity Retouching, ND-IIITD Retouching) | Deep learning features (face patches) | RBM | CR= 96.2%, |
| [ | Own TweepFake | Machine learning | LR_BOW, | ROBERTA_FT 89.6%, |
| [ | Own Deep Fakes dataset | CNN | Biological signals | 91.7% |
Comparison of benchmark datasets.
| Reference | Dataset | Classifier | Method | Performance |
|---|---|---|---|---|
| [ | Deepfake Forensics | CNN | DFT-MF | Deepfake Forensics dataset 71.25% |
| [ | FaceForensics++ dataset | CNN | CNN XceptionNet | At 1.3x background scale 94.33% accuracy |
| [ | DeepfakeTIMIT (LQ) | PCA+RNN | Audio-visual features | DeepfakeTIMIT (LQ) EER = 3.3% |
| [ | FaceForensics dataset | Logistic regression | Visual features | 86.6% LR |
| [ | FaceForensics++, | CNN, | FSSPOTTER | FaceForensics++ 100%, |
| [ | DFD Celeb-DF, | CNN | XceptionNet | Transfer learning: With transfer learning 86.49%, |
| [ | Celeb-DF-FaceForensics++ (c23) | CNN | YOLO-CNN-XGBoost | 90.62% AUC, |
| [ | DFO, FSh, CDF, | ResNet-18 and MS-TCN | Semantic irregularities | 82.4% for CDF, 73.5% for DFDC, 97.1% for FSh, and 97.6% for DFo datasets. |
| [ | FF++, DFDC, and CDF | CNN | Multi-attentional framework | 97.60% for FF++, 67.44% for CDF and 0.1679 Logloss for DFDC. |
| [ | DFD, CDF, and FF++ | NN and CNN | Multi-feature fusion | 99.73% for FF++, 92.53% for DFD, and 75.07% for CDF dataset. |
| [ | DFDC | NN and CNN | NN compression | 93.9% for DFDC dataset. |
| [ | TIMIT-DF, DFD, | J48 | Feature fusion | 94.21% for TIMIT-DF, 96.36% for DFD, |
| [ | FF++, TIMIT HQ, | 3D CNN | Channel transformation | 99.83% for FF++, 99.28% for TIMIT HQ, |
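
The two comparison tables report results under several different metrics (accuracy, AUC, EER, and log loss), which makes the rows hard to compare directly. As a rough illustration only, the sketch below shows one common way these metrics can be computed from a detector's scores, assuming label 1 means "fake", label 0 means "real", and a higher score means "more likely fake". The function names and the toy data are hypothetical and do not come from any of the surveyed studies, which may use different conventions (for example, interpolated EER).

```python
import math

def roc_points(labels, scores):
    """ROC points (fpr, tpr) from sweeping a threshold over the scores.
    Assumes at least one fake (1) and one real (0) sample are present."""
    pos = sum(labels)
    neg = len(labels) - pos
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        j = i
        # Samples with tied scores cross the threshold together.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            tp += pairs[j][1]
            fp += 1 - pairs[j][1]
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points

def auc(labels, scores):
    """Area under the ROC curve by the trapezoidal rule."""
    pts = roc_points(labels, scores)
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

def eer(labels, scores):
    """Approximate equal error rate: the ROC point where the false positive
    rate and false negative rate (1 - tpr) are closest, averaged."""
    fpr, tpr = min(roc_points(labels, scores),
                   key=lambda p: abs(p[0] - (1 - p[1])))
    return (fpr + (1 - tpr)) / 2

def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples classified correctly at a fixed threshold."""
    hits = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return hits / len(labels)

def log_loss(labels, probs, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)

# Toy example (made-up scores, not results from any study).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
print(auc(labels, scores), eer(labels, scores), accuracy(labels, scores))
```

Accuracy depends on the chosen threshold, whereas AUC and EER summarize behavior across all thresholds. This is one reason a figure such as "97.4% AUC" in one row cannot be compared directly with a plain accuracy percentage in another.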