Qiaozheng Wang, Xiuguo Zhang, Xuejie Wang, Zhiying Cao.
Abstract
The log messages a system generates reflect its state at all times. Detecting anomalies in log messages automatically helps operators find them in time and provides a basis for analyzing their causes. This paper proposes a log sequence anomaly detection method based on contrastive adversarial training and dual feature extraction. The method uses BERT (Bidirectional Encoder Representations from Transformers) and VAE (Variational Auto-Encoder) to extract the semantic features and statistical features of a log sequence, respectively; the two feature types are combined to perform anomaly detection, and the model is trained with a novel contrastive adversarial training method. The paper also describes how the statistical features of a log sequence are obtained and how the semantic and statistical features are combined, and it details the contrastive adversarial training process. Finally, an experimental comparison shows that the proposed method outperforms the log sequence anomaly detection methods it is compared against.
Keywords: BERT; VAE; adversarial training; contrastive learning; statistical features
Year: 2021 PMID: 35052095 PMCID: PMC8774910 DOI: 10.3390/e24010069
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
Method comparison table.
| Method | Input | Model or Algorithm | Strategy for Unseen Logs |
|---|---|---|---|
| SVM | event count vector | construct a hyperplane | unable to deal with unseen logs |
| IM | event count vector | singular value decomposition, | unable to deal with unseen logs |
| PCA | event count vector | construct normal and abnormal subspaces | unable to deal with unseen logs |
| DeepLog | logkey, | LSTM | unable to deal with unseen logs |
| CNN | logkey | CNN | unable to deal with unseen logs |
| LogRobust | semantic vector | Bi-LSTM with Attention | semantic vector conversion, attention mechanism |
| CATLog | semantic vector, statistical vector | BERT, VAE | semantic vector conversion, contrastive adversarial training |
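As a rough illustration of the "event count vector" input listed in the table, the sketch below counts how often each known log key occurs in one log sequence. The function name `event_count_vector`, the template IDs, and the fixed vocabulary are assumptions made for this example; they are not taken from the paper.

```python
from collections import Counter


def event_count_vector(log_keys, vocabulary):
    """Count how often each known log key (template ID) occurs in one sequence."""
    counts = Counter(log_keys)
    # Keys outside the vocabulary have no slot in the vector and are dropped.
    return [counts.get(key, 0) for key in vocabulary]


vocabulary = ["E1", "E2", "E3"]       # hypothetical template IDs
sequence = ["E1", "E2", "E1", "E9"]   # "E9" stands for an unseen template
vector = event_count_vector(sequence, vocabulary)
print(vector)
```

In this sketch the unseen key `E9` leaves no trace in the vector, which is one plausible reason count-based inputs are listed as unable to deal with unseen logs.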
Figure 1. The method flow of CATLog.
Figure 2. Information contained in log entries.
Figure 3. Example of log parsing process.
Figure 4. The BERT model with the multi-head self-attention network as the embedding layer.
Figure 5. Anomaly detection model.
Figure 6. The specific process of contrastive adversarial training.
The impact of the size of on the accuracy of anomaly detection (HDFS dataset).
| Round | 0 | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 |
|---|---|---|---|---|---|---|
| First round | 0.985 / 0.985 | 0.985 / 0.985 | 0.986 / 0.986 | 0.988 / 0.988 | 0.987 / 0.987 | 0.987 / 0.987 |
| Second round | 0.984 | 0.986 | 0.987 | 0.989 | 0.988 | 0.987 |
| Third round | 0.984 | 0.986 | 0.987 | 0.988 | 0.987 | 0.988 |
| Fourth round | 0.986 | 0.985 | 0.986 | 0.987 | 0.987 | 0.987 |
| Fifth round | 0.985 | 0.985 | 0.985 | 0.988 | 0.987 | 0.988 |
The impact of the size of on the accuracy of anomaly detection (BGL dataset).
| Round | 0 | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 |
|---|---|---|---|---|---|---|
| First round | 0.988 / 0.987 | 0.989 / 0.988 | 0.990 / 0.991 | 0.989 / 0.990 | 0.987 / 0.988 | 0.987 / 0.989 |
| Second round | 0.987 | 0.988 | 0.991 | 0.990 | 0.989 | 0.988 |
| Third round | 0.987 | 0.987 | 0.992 | 0.990 | 0.989 | 0.990 |
| Fourth round | 0.987 | 0.988 | 0.990 | 0.989 | 0.988 | 0.990 |
| Fifth round | 0.988 | 0.988 | 0.990 | 0.991 | 0.988 | 0.989 |
Figure 7. (a) Comparison of accuracy of different methods on HDFS; (b) Comparison of accuracy of different methods on BGL; (c) Comparison of recall of different methods on HDFS; (d) Comparison of recall of different methods on BGL; (e) F1-score comparison of different methods on HDFS; (f) F1-score comparison of different methods on BGL.
The ordinal values of different models.
| Dataset | SVM | DeepLog | LogRobust | CATLog(unCAT) | CATLog |
|---|---|---|---|---|---|
| HDFS | 5 | 4 | 3 | 2 | 1 |
| BGL | 5 | 4 | 2.5 | 2.5 | 1 |
| Average ordinal value | 5 | 4 | 2.75 | 2.25 | 1 |
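The last row of the table can be checked with a few lines of arithmetic. The sketch below copies the per-dataset ordinal values from the table and averages them per method; the helper name `average_ordinals` is introduced here for illustration only. The value 2.5 on BGL is consistent with two tied methods sharing the mean of ranks 2 and 3, though that reading is an assumption.

```python
from statistics import mean

# Ordinal values copied from the table (1 = best).
ordinals = {
    "HDFS": {"SVM": 5, "DeepLog": 4, "LogRobust": 3, "CATLog(unCAT)": 2, "CATLog": 1},
    "BGL": {"SVM": 5, "DeepLog": 4, "LogRobust": 2.5, "CATLog(unCAT)": 2.5, "CATLog": 1},
}


def average_ordinals(table):
    """Average each method's ordinal value over all datasets."""
    methods = list(next(iter(table.values())))
    return {m: mean([row[m] for row in table.values()]) for m in methods}


averages = average_ordinals(ordinals)
for method, value in averages.items():
    print(method, value)
```

The computed averages match the table: 2.75 for LogRobust and 2.25 for CATLog(unCAT).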
Figure 8. Update of log template.
Figure 9. Comparison of robustness.