Kangyi Ding, Teng Hu, Weina Niu, Xiaolei Liu, Junpeng He, Mingyong Yin, Xiaosong Zhang.
Abstract
The Internet has become the main channel of information communication and carries a large amount of secret information. Although network communication is convenient, it also carries a risk of information leakage. Traditional image steganography relies on manually crafted algorithms or custom models, whereas our approach uses ordinary OCR models for information embedding and extraction. Even if our OCR models are intercepted, it is difficult to link them to steganography. We propose a novel steganography method for character-level text images based on adversarial attacks. We exploit the complexity and uniqueness of neural network boundaries and use neural networks as a tool for information embedding and extraction. We use an adversarial attack to embed the steganographic information into the character region of the image. To avoid detection by other OCR models, we optimize the generation of the adversarial samples and use a verification model to filter the generated steganographic images, which in turn ensures that the embedded information can only be recognized by our local model. The decoupling experiments show that the strategies we adopt to weaken transferability reduce the possibility of other OCR models recognizing the embedded information while preserving the success rate of information embedding. Meanwhile, the perturbations added to embed the information are acceptable. Finally, through parameter selection experiments, we explore the impact of different parameters on the algorithm and its potential. We also verify the effectiveness of our verification model in selecting the best steganographic images. The experiments show that our algorithm achieves a 100% information embedding rate and a steganography success rate of more than 95% with 3 samples per group. In addition, our embedded information can hardly be detected by other OCR models.
Keywords: OCR models; adversarial attack; steganography; transferability
Year: 2022 PMID: 36080965 PMCID: PMC9460549 DOI: 10.3390/s22176497
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
Figure 1. Schematic diagram of our steganography method. We embed the steganographic message by attacking the local model, and extract it by capturing the characters whose recognition result from the local model differs from that of the reference model.
Figure 2. Flowchart of the generation of adversarial samples. After almost invisible noise is added to an image of a pretzel, the AI model classifies it as a king crab with a high degree of confidence.
Figure 3. The framework of the character-level text image steganography method based on adversarial attacks. The red boxes mark the characters we selected for modification. We have embedded the character "squeeze" in the character "harm".
Figure 4. Statistical distribution of the information extraction success rate.
Figure 5. Statistical distribution of the probability of no impact on the testing models' results.
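The extraction rule in the Figure 1 caption (a character carries hidden information when the local model reads it differently from the reference model) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Recognizer` type, `extract_hidden`, and the toy models are hypothetical stand-ins, and treating the local model's reading as the embedded symbol is an assumption.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type: a recognizer maps one character image to a predicted label.
Recognizer = Callable[[object], str]


def extract_hidden(
    char_images: Sequence[object],
    local_model: Recognizer,
    reference_model: Recognizer,
) -> List[Tuple[int, str]]:
    """Return (position, label) pairs where the two models disagree.

    Assumption: a disagreement marks a character carrying hidden
    information, and the local model's reading is the embedded symbol.
    """
    hidden = []
    for i, img in enumerate(char_images):
        local_label = local_model(img)
        if local_label != reference_model(img):
            hidden.append((i, local_label))
    return hidden


# Toy stand-ins for the two OCR models (not real OCR): each "image" is a
# dict holding the visible character and, optionally, an embedded one.
def toy_reference(img):
    return img["visible"]


def toy_local(img):
    return img.get("embedded", img["visible"])


images = [{"visible": "h"}, {"visible": "a", "embedded": "s"}, {"visible": "r"}]
print(extract_hidden(images, toy_local, toy_reference))  # [(1, 's')]
```

In a real setting the two callables would wrap actual OCR models, and only the holder of the local model could run this comparison.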
Average of the evaluation metrics we used.
| Method | Extraction Success Rate | No Impact on Testing Model | Correct Recognition by Testing Model | Steganography Success Rate | Distortion |
|---|---|---|---|---|---|
| Ours | 96.9% | 82.8% | 95.7% | 80.23% | 0.2137 |
| Without ① | 90.8% | 55.6% | 72.3% | 50.48% | 0.3239 |
| Without ① and ② | 68.2% | 77.3% | 89.2% | 52.72% | 0.2760 |
| Without ② | 97.6% | 76.4% | 88.3% | 74.57% | 0.2818 |
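The values in the fourth numeric column (80.23% for our method) appear to equal the product of the extraction rate and the no-impact rate in the same row. The short check below illustrates this with the numbers copied from the table. It is an observation about the tabulated data, not a definition stated in this record, and `joint_rate` is a hypothetical helper name.

```python
# Rates copied from the table, in percent:
# (extraction success, no impact on testing model, reported fourth-column value)
rows = {
    "Ours": (96.9, 82.8, 80.23),
    "Without ①": (90.8, 55.6, 50.48),
    "Without ① and ②": (68.2, 77.3, 52.72),
    "Without ②": (97.6, 76.4, 74.57),
}


def joint_rate(extraction_pct: float, no_impact_pct: float) -> float:
    """Product of two percentage rates, returned as a percentage."""
    return extraction_pct * no_impact_pct / 100.0


for method, (extraction, no_impact, reported) in rows.items():
    product = joint_rate(extraction, no_impact)
    # Allow for the two-decimal rounding used in the table.
    matches = abs(product - reported) < 0.01
    print(f"{method}: product={product:.2f}% reported={reported}% match={matches}")
```

Every row matches to within rounding, which suggests that a steganographic image counts as successful only when the message is extracted and the testing model's result is unaffected.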
Figure 6. The steganographic image generated by our method. The semantics of the selected character in the original sample is completely changed.
Variation of the extraction success rate, the correct recognition rate of the testing model, the probability of no impact on the testing model, the steganography success rate, and the distortion with the regular term coefficient.
| Regular Term Coefficient | Extraction Success Rate | No Impact on Testing Model | Correct Recognition by Testing Model | Steganography Success Rate | Distortion |
|---|---|---|---|---|---|
| 10 | 99.8% | 80.2% | 92.1% | 80.04% | 0.2369 |
| 20 | 99.3% | 81.4% | 94.5% | 80.83% | 0.2295 |
| 30 | 98.2% | 81.5% | 94.4% | 80.03% | 0.2243 |
| 40 | 96.9% | 82.8% | 95.7% | 80.23% | 0.2223 |
| 50 | 95.8% | 82.3% | 93.6% | 78.84% | 0.2187 |
| 60 | 94.9% | 81.6% | 92.8% | 77.44% | 0.2159 |
Variation of the extraction success rate, the probability of no impact on the testing model, the correct recognition rate of the testing model, the steganography success rate, and the distortion with the number of samples per group.
| Number of Samples per Group | Extraction Success Rate | No Impact on Testing Model | Correct Recognition by Testing Model | Steganography Success Rate | Distortion |
|---|---|---|---|---|---|
| 1 | 96.9% | 82.8% | 95.7% | 80.23% | 0.2223 |
| 2 | 99.20% | 94.30% | 98.60% | 93.55% | 0.2187 |
| 3 | 100% | 95.90% | 98.8% | 95.90% | 0.2254 |
| 4 | 100% | 96.60% | 98.90% | 96.60% | 0.2179 |
| 5 | 100% | 97.20% | 98.80% | 97.20% | 0.2214 |
Confusion matrix of embedded and extracted information with only one sample per group.
| | Extracted Information | | |
|---|---|---|---|
| positive | 957 | 4 | 39 |
| negative | 0 | 0 | 9000 |
Confusion matrix of SRNet’s detection results.
| | Detection Results | |
|---|---|---|
| positive | 0 | 1000 |
| negative | 0 | 1000 |
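The counts above can be turned into a detection accuracy. The sketch below assumes that the rows are ground-truth classes (positive for steganographic images, negative for clean images) and that the two unlabeled columns are SRNet's two output classes. Because the column labels did not survive in this record, the accuracy is computed under both possible column orders.

```python
# Counts copied from the table. Assumption: rows are ground truth,
# columns are SRNet's two output classes (order unknown).
stego_row = (0, 1000)  # "positive" row
cover_row = (0, 1000)  # "negative" row


def accuracy(stego, cover, positive_col):
    """Fraction of images classified correctly, given which column
    index holds SRNet's 'positive' decisions."""
    negative_col = 1 - positive_col
    correct = stego[positive_col] + cover[negative_col]
    total = sum(stego) + sum(cover)
    return correct / total


for col in (0, 1):
    print(f"positive column = {col}: accuracy = {accuracy(stego_row, cover_row, col)}")
```

Under either column order the accuracy is 0.5, since SRNet assigns all 2000 images to the same class. This is the chance level for a balanced test set.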
Effects of different compression qualities on steganographic information extraction.
| q | 100 | 90 | 80 | 70 | 60 |
|---|---|---|---|---|---|
| extraction success rate | 100% | 73.7% | 59.7% | 48.4% | 44.8% |
| local model recognition rate on others | 100% | 99.98% | 99.95% | 99.87% | 99.62% |
| reference model recognition rate | 100% | 99.99% | 99.94% | 99.88% | 99.63% |
where q denotes the JPEG compression quality.