Pilla Vaishno Mohan1, Shriniket Dixit1, Amogh Gyaneshwar1, Utkarsh Chadha2, Kathiravan Srinivasan1, Jung Taek Seo3.
Abstract
With information systems worldwide being attacked daily, analogies from traditional warfare are apt, and deception tactics have historically proven effective as both a strategy and a technique for defense. Defensive Deception includes thinking like an attacker and determining the best strategy to counter common attack strategies. Defensive Deception tactics are effective at introducing uncertainty for adversaries, increasing their learning costs, and, as a result, lowering the likelihood of successful attacks. In cybersecurity, honeypots, honeytokens, camouflaging, and moving target defense commonly employ Defensive Deception tactics. Deceptive and anti-deceptive technologies have been created for a variety of purposes. However, there is a critical need for a broad, comprehensive and quantitative framework that can help us deploy advanced deception technologies. Computational intelligence provides an appropriate set of tools for creating advanced deception frameworks. Computational intelligence comprises two significant families of artificial intelligence technologies: deep learning and machine learning. These strategies can be used in various situations in Defensive Deception technologies. This survey focuses on Defensive Deception tactics deployed with the help of deep learning and machine learning algorithms. The insights, lessons, and limitations yielded by prior work are presented in this study. It culminates with a discussion about future directions, which helps address the important gaps in present Defensive Deception research.
Keywords: computational intelligence; deep learning; defensive deception; honeypots; machine-learning; moving target defense
Year: 2022 PMID: 35336373 PMCID: PMC8952217 DOI: 10.3390/s22062194
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Review articles of the CI-enabled techniques in Defensive Deception (✓: Yes, ×: No).
| Ref. | Year | No. of Articles | Brief on Focus (One-Sentence Summary) | CI-Enabled Techniques: Machine Learning | CI-Enabled Techniques: Deep Learning | Open Challenges | Future Directions |
|---|---|---|---|---|---|---|---|
| [ | 2011 | 28 | A Review of Classification Approaches Using Support Vector Machine in Intrusion Detection | ✓ | × | × | ✓ |
| [ | 2012 | 191 | Review article on Nature-Inspired Techniques in the Context of Fraud Detection | ✓ | × | × | ✓ |
| [ | 2012 | 72 | Review article on the use of Data Mining Techniques for financial fraud detection. | ✓ | × | ✓ | ✓ |
| [ | 2013 | 62 | A review article on Computational Intelligence Models for Insurance Fraud Detection | ✓ | × | × | ✓ |
| [ | 2015 | 91 | A review on application of AI techniques for combatting cybercrime | ✓ | × | ✓ | ✓ |
| [ | 2018 | 77 | A survey of Artificial Intelligence in Cyber security | ✓ | ✓ | × | ✓ |
| [ | 2018 | 41 | Review article on the use of machine learning techniques for financial fraud detection. | ✓ | × | × | ✓ |
| [ | 2018 | 111 | A Survey article on Cyber Defensive Techniques employed with the help of Machine Learning algorithms | ✓ | × | ✓ | ✓ |
| [ | 2019 | 380 | A review of defensive tools and technologies employed in cyberspace | ✓ | × | ✓ | ✓ |
| [ | 2019 | 173 | A Survey on implementation of adaptive technologies in Moving Target Defense | ✓ | × | ✓ | ✓ |
| [ | 2020 | 65 | A review article on the implementation of Artificial Intelligence technologies in Electronic Warfare | ✓ | ✓ | × | ✓ |
| [ | 2020 | 145 | A Survey article on the implementation of AI, machine learning, and blockchain technology in IoT security | ✓ | × | ✓ | ✓ |
| [ | 2020 | 75 | A review of deception technologies used in cyber security and user privacy. | ✓ | × | ✓ | ✓ |
| [ | 2020 | 83 | Review article on AI and machine learning for cybersecurity | ✓ | ✓ | × | ✓ |
| [ | 2020 | 175 | A Survey article on Moving Target Defenses in order to implement Network Security | ✓ | × | ✓ | ✓ |
| [ | 2021 | 187 | A Review of Defensive Deception techniques Employed with the help of Game Theory and Machine Learning. | ✓ | ✓ | × | ✓ |
| Our Review | 2022 | 77 | Our review has briefly described various prominent ML and DL models and their use in Deception Technologies. | ✓ | ✓ | ✓ | ✓ |
Figure 1. PRISMA flow diagram for the selection process of the research articles used in this review.
Figure 2. Current machine learning models in defensive deception: nomenclature.
A summary of works on machine learning techniques in defensive deception.
| Ref. | Deception-Category | Machine Learning Approaches Used | Key Contribution | Limitations |
|---|---|---|---|---|
| [ | Honeypots, honey webs, honeynets, honeyfiles, HMAC, Moving target defense, obfuscation. | K-Means, Support Vector Machine, Hierarchical Grouping, Expectation-Maximization (EM), Bayesian Network (Bayes Net), Decision Tree (DT), Naïve-Bayes Algorithm, C4.5 Algorithm. | This work is primarily concerned with reviewing game-theoretic and machine learning-based Defensive Deception approaches and addressing the findings, limits, and lessons learned from this comprehensive study. | Various deep learning and machine learning approaches such as genetic algorithms, Ensemble Models, Self-organising maps, etc., were not taken into account for Deception. |
| [ | Moving target defense | Ensemble model used | This research first classified various Moving Target Defenses according to the surfaces on which these defenses operate. Secondly, they talked about how these MTDs can be put into effect. | The survey did not consider better machine learning and deep learning approaches to implement moving target defenses. |
| [ | Honeypot | C4.5, Decision Tree, Naive-Bayes and Bayes Net. | They employed a machine learning method to predict the most vulnerable and easily attackable host in an SDN (Software Defined Networking) network. The security rules for the SDN controller can be developed using the prediction output of machine learning algorithms to prevent unauthorized user access. The experiments revealed that machine learning techniques could enhance security rules for SDN controllers by properly anticipating potential susceptible hosts. The Bayesian Network achieved about 91.68 percent of average prediction accuracy. | New machine learning approaches such as neutrosophic sets were not taken into consideration. |
| [ | Honeypots | Logistic Regression, SVM, KNN, Naive Bayes, ensemble-based models, Random Forest with Gini, and Extra Tree classifiers with Gini. | Their research demonstrated that fraudulent clicks on Instagram, generated through a variety of tactics, might boost the popularity index of posts. They used honeypots and botnets to launch attacks and collect data, such as clicks on various posts, from real and fake accounts. Experimental data show that LR is the most accurate predictor among the single-model approaches, and Random Forest is the best among the ensemble-based methods. | They did not consider other approaches, such as hybrid learning models, ANN, etc., to validate whether a view is legitimate or fake based on the chosen criteria. |
| [ | Obfuscation, Honeypot | Naïve Bayes | They methodically cataloged and ranked the available information system deception options, both offensively and defensively. Then they thought about how Defensive Deceptions could be packaged into “generic explanations” that an attacker would find more persuasive than individual refusals to accept directives. | Latest and better machine learning approaches were not used. |
| [ | Obfuscation, Honeypot | Decision Tree | A unique deception strategy was developed for network defenses that achieves reactive unpredictability by combining security postures and probabilistic decision trees. They developed a new decision-tree grammar that allows analysts to specify and identify potential responses based on warnings, mission processes, security postures, and various asset conditions. A real-time simulation based on an organization and its activities, together with a historical dataset, was used to implement, demonstrate, and assess their technique. | A probabilistic decision system can learn optimal decision tree order execution and security postures. Trees that are manually or automatically generated should potentially be improved to boost speed, especially as they grow larger. Attacks are not learned in the current implementation. |
| [ | Moving target defense | Genetic algorithm | They conducted a thorough study of MTD techniques, their core classifications, important design features, frequent attack behaviors addressed by existing MTD implementations. The literature also explored various application fields for the MTD techniques. | This article only briefly investigated the relationship between MTD and other defense systems. There has been little research that looks into the influence of MTD on minimizing attacks after the reconnaissance stage. There has not been much research into the best way to use numerous hybrid MTD approaches. Existing MTD methodologies have limitations in monitoring several parameters of a system’s quality. |
| [ | Honeypots | Support vector machine | They described the creation of a novel honeypot-based social bot in order to detect malicious profiles present in social networking groups. Their overall study goal is to look at techniques and propose effective solutions to automatically recognize and filter the profiles of harmful people who target social networking platforms. In order to attract fraudulent accounts, their strategy employs social honeypot personas. | The SVM algorithm used in this article is not suitable for large datasets. It does not perform very well when the dataset has more noise which is the usual case for Twitter accounts. |
| [ | Perturbation | Artificial neural network | They demonstrated how ANN might be used to modestly adjust the output probabilities by perturbing the final activation layer of the model. The opponent is forced to ignore the class probabilities, making it necessary to use more queries before successfully performing an attack. | Other machine learning and deep learning approaches were not considered for implementing the system. |
| [ | Honeypot | Decision tree | A decision tree is more useful when a honeynet is available rather than a single honeypot. Different techniques can then be tested independently to determine how well they work and what risks they entail. This is achieved by calculating the average benefit for several honeypot and honeynet layouts and choosing the one with the highest average benefit. | Other machine learning algorithms were not used to examine the various scenarios generated by the honeynet. |
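The selection rule in the last row of the table above (compute the average benefit of each honeypot or honeynet layout and keep the one with the highest average) can be illustrated with a minimal Python sketch. The layout names, the benefit scores, and the function `pick_layout` are hypothetical and are not taken from the cited work, which does not specify how benefit is measured.

```python
from statistics import mean


def pick_layout(benefits_by_layout):
    """Return (layout, average_benefit) for the layout with the highest mean benefit.

    benefits_by_layout maps a layout name to a list of observed benefit scores.
    How a benefit score is obtained is an assumption left to the caller.
    """
    averages = {name: mean(scores) for name, scores in benefits_by_layout.items()}
    best = max(averages, key=averages.get)
    return best, averages[best]


# Hypothetical benefit scores from three trial runs per layout.
observed = {
    "single_honeypot": [2.0, 3.0, 1.0],
    "flat_honeynet": [4.0, 5.0, 3.0],
    "layered_honeynet": [6.0, 2.0, 7.0],
}

best_layout, best_average = pick_layout(observed)
print(best_layout, best_average)
```

With these made-up scores the layered layout is selected; in practice the scores would come from whatever benefit and risk measurements the defender collects for each layout.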
Figure 3. Current deep learning models in defensive deception: nomenclature.
A summary of works on Deep Learning Models in Defensive Deception.
| Ref. | Deception-Category | Deep Learning Models Used | Key Contribution | Limitations |
|---|---|---|---|---|
| [ | Money related deception | | There is also a new term, Honeyfile, used in this article. Honeyfiles are also used to create confusion and apprehension about the value and location of sensitive data. This method is based on humans’ inability to discern between authentic and bogus information. | There comes a time when cyber security is being scrutinized by the public due to an increasing number of occurrences, even though only a fraction of these instances can be traced back to particular individuals or groups of Blackhats. |
| [ | Honeypots, Perturbation | Online Adaptive Metric Learning | Because honeypots are completely “fake systems,” there are a variety of methods available to determine whether the present system is a honeypot or not. They are built with this underlying restriction in mind. | |
| [ | Honeypots | Recurrent neural network | This study describes a distributed infrastructure capable of deploying decoys across different network segments and managing their physical world perspectives. This solution’s prototype implementation and use case for a boiler model are only two examples of how this new methodology could be used. | To better understand and improve the situation, more research is required. Betterment of fidelity of decoys by generating vendor/product-specific characteristics that include things such as protocols used, ports used, and register point settings. |
| [ | Moving target defense, perturbation | Deep neural and deep convolution neural network | In this study, they proposed MTDeep, an MTD-inspired cybersecurity architecture offered as a security service to improve the safety of Deep Neural Network (DNN)-based classification systems. To model the interaction between MTDeep and its users, they used a Bayesian Stackelberg Game. The equilibrium offers the best trade-off for the multi-objective problem of lowering misclassification rates on adversarially modified images while retaining high classification accuracy on images that have not been perturbed. | This article did not examine other neural networks, such as RNN, self-organizing maps, etc. |
| [ | Moving target defense | Deep neural network, deep convolution network, and deep reinforcement learning. | The authors have labeled the architecture of RL-based CRM (RL-CRM) according to the types of vulnerabilities it attempts to address. They have shown that the RL-CRM can set up moving target defense, engage attackers for reconnaissance, and lead human attention to mitigate visual weaknesses adaptively and autonomously. Their research revealed that posture-related defense technologies are well-developed, but mitigation options for information-related and human-induced vulnerabilities are still in the early stages of development. | The first hurdle in the learning process is to deal with system and performance limits. Many system limits exist in cyber systems that must be explicitly considered. The improvement of learning speed is a second difficulty. CRM’s (Cyber-Resilient Mechanism) purpose is to restore the cyber system following an attack. Fast learning would allow for a more rapid and resilient response to an attack. Dealing with the non-stationarity of cyber systems is the third difficulty. The environment is assumed to be stationary and ergodic in traditional RL algorithms. |
| [ | Honeypot, obfuscation | Deep neural network, deep reinforcement learning | They first introduced SRG (System Risk Graph), a precise adversarial model for extracting specific dangers and internet treatments, such as vulnerabilities in the software and virtualization layers. The adversarial model is updated based on the existing condition system. They proposed a deception rate, which is a statistical parameter for evaluating the efficiency of the deployment method based on SRG. Second, they tweaked a DRL algorithm to develop an adjustable decoy deployment strategy for a rapidly changing internet. Finally, they compared the proposed methodology to existing research using simulations. | This article did not analyze other neural networks such as recurrent neural networks, convolution neural networks, etc. |
| [ | Honeypot, obfuscation | Deep neural network, Online Adaptive Metric Learning | A machine learning-based framework for evaluating cyber deception defenses with minimum human participation is developed and implemented. This avoids the problems that come with deception research on humans. Automated evaluations, made as effective as possible, must be completed prior to any human study; only then can the next step begin. | They were unable to apply labels to previously unknown categories automatically. |
| [ | Moving target defense | Deep neural network, deep convolution network | They conducted a thorough study of MTD techniques, their core classifications, important design features, frequent attack behaviors addressed by current MTD techniques, and implementation found in this article. | This article only briefly investigated the relationship between MTD and other defense systems. There has been little research that looks into the influence of MTD on minimizing attacks after the reconnaissance stage. There has not been much research into the best way to use numerous hybrid MTD approaches. Existing MTD methodologies have limitations in monitoring several parameters of a system’s quality. |
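The moving-target idea in the DNN classifier row of the table above (answering each query with a network drawn at random so an attacker cannot tailor inputs to one fixed model) can be sketched as follows. This is only an illustration under stated assumptions: the model names, the mixed strategy, and the accuracy figures are hypothetical, and the strategy is supplied by hand rather than computed from a Bayesian Stackelberg game as in the cited work.

```python
import random


def choose_model(strategy, rng):
    """Sample one model name according to the defender's mixed strategy.

    strategy maps model names to selection probabilities that sum to 1.
    """
    if abs(sum(strategy.values()) - 1.0) > 1e-9:
        raise ValueError("strategy probabilities must sum to 1")
    names = list(strategy)
    weights = [strategy[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def expected_accuracy(strategy, accuracy):
    """Expected accuracy when each query is served by a randomly drawn model."""
    return sum(strategy[name] * accuracy[name] for name in strategy)


# Hypothetical ensemble: names, probabilities, and accuracies are assumptions.
strategy = {"net_a": 0.5, "net_b": 0.25, "net_c": 0.25}
accuracy = {"net_a": 0.9, "net_b": 0.8, "net_c": 0.7}

rng = random.Random(0)  # seeded only to make the demonstration repeatable
picked = [choose_model(strategy, rng) for _ in range(1000)]
print({name: picked.count(name) for name in strategy})
print(expected_accuracy(strategy, accuracy))
```

The trade-off the table describes shows up in `expected_accuracy`: shifting probability toward weaker but more diverse networks makes the served model harder to predict at the cost of accuracy on unperturbed inputs.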
List of various Defense Deception datasets.
| Ref. | Year | Authors | Dataset Used | Dataset Size | Format | Details about the Dataset/Brief Description |
|---|---|---|---|---|---|---|
| [ | 2012 | Ali Shiravi, Mahbod Tavallaee, Hadi Shiravi, Ali A. Ghorbani, | ISCXIDS2012 | 16.1 GB | Testbeds from Wireshark | This dataset was developed using a dynamic approach. Their strategy is divided into an Alpha profile and a Beta profile. The Alpha profile uses several multi-stage attack patterns to monitor the anomalous part of the dataset. On the other hand, the Beta traffic generator simulates genuine network traffic, including background noise. |
| [ | 2013 | Gideon Creech, J. Hu | ADFA IDS | 5951 records | Training and Validation type | The dataset includes password brute-force attacks against FTP and SSH. It also includes the C100 Webshell payload, Linux Meterpreter, Java-based Meterpreter, and attack vectors with 10 attacks per vector. |
| [ | 1999 | Salvatore J. Stolfo, Wei Fan, Wenke Lee, Andreas Prodromidis, and Philip K. Chan | KDD CUP 1999 | 2 million connection records with 41 features | relational | It is commonly used as a standard dataset for IDS simulations by researchers. |
| [ | 2000 | Mahbod Tavallaee, Ebrahim Bagheri, Wei Lu, Ali A. Ghorbani | DARPA | 5000 records | relational | The 1999 DARPA Intrusion Detection Evaluation consisted of an off-line and a real-time intrusion detection evaluation. |
| [ | 2016 | Prudhvi Ratna Badri Satya, Kyumin Lee, Dongwon Lee, Thanh Tran, Jason (Jiasheng) Zhang | Likes of Facebook | Records including like are 13,147 | relational | A study of fake Facebook Likers obtained from company employees that use the link and honeypot approaches was done. False Likers differed from genuine Likers in terms of liking behaviors, duration, etc. |
Figure 4. Methods to implement Defensive Deception.
Classification of several deception categories.
| Reference | Year | Deception Technique | Level of Interaction | Scalability | Resource Level | Goal | Main Attack | Strategy | Domain |
|---|---|---|---|---|---|---|---|---|---|
| [ | 2019 | False patch technique | High | Yes | Virtual | Property preservation | Advanced persistent threats | Incorrect facts; Fraud; imitating | Game theory |
| [ | 2015 | Honeypot, designed lure | High | Limited | Virtual | Security for assets; identification of attacks | Probing | Deceiving and imitating | Game theory |
| [ | 2019 | Honeypot | Medium | Yes | Hybrid | Safeguarding assets; identification of attacks | DoS assaults, network drops, and APTs | Deceiving; tries to imitate | IoT |
| [ | 2018 | Honey webs | Low | Yes | Virtual | Preservation of Assets | cyberattack | Deceiving; imitating | Cloud services from the internet |
| [ | 2018 | Deceiving signals | Competitive | NA | Physical | Protection of resources; monitoring of attacks | Advanced persistent threats | Misguiding; concealing; imitating; deceiving | Not specified |
| [ | 2021 | Misleading Network traffic | Dynamic/high | NA | Physical | Assets preservation | Recon/Investigating | Disguising; mirroring | Cyber–physical system |
| [ | 2016 | Social Honeypot | High | NA | Virtual | Identifying the adversary | The malevolent demeanor of a user | Imitating | A domain is not specified |
Figure 5. Open Problems in CI-enabled Defensive Deception.
Figure 6. Future directions in Defensive Deception.