Xiaolei Liu, Xiaojiang Du, Xiaosong Zhang, Qingxin Zhu, Hao Wang, Mohsen Guizani.
Abstract
Many Internet of Things (IoT) devices run Android or Android-like systems. As machine learning algorithms have matured, learning-based Android malware detection systems for IoT devices have become increasingly common. However, these learning-based detection models are often vulnerable to adversarial samples, so an automated testing framework is needed to support security analysis of such systems. Most current methods for generating adversarial samples require the training parameters of the model, and most target image data. To address this, we propose a testing framework for learning-based Android malware detection systems (TLAMD) for IoT devices. The key challenge is constructing a suitable fitness function that yields effective adversarial samples without affecting the features of the application. By introducing a genetic algorithm and several technical improvements, our framework generates adversarial samples for IoT Android applications with a success rate of nearly 100% and can perform black-box testing on the system.
Keywords: Internet of Things; adversarial samples; machine learning; malware detection
Year: 2019 PMID: 30823597 PMCID: PMC6413143 DOI: 10.3390/s19040974
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Overview of our testing framework for learning-based Android malware detection systems for IoT devices. (1) Input the original sample; (2) calculate the disturbance size; (3) generate the adversarial samples; (4) get the detection result from the learning-based system; (5) determine whether the exit condition is met; (6) if not, calculate the new disturbance size using the genetic algorithm; (7) if yes, output the final adversarial Android application.
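The loop in Figure 1 can be sketched as a small black-box genetic search. This is a minimal illustration under stated assumptions, not the authors' TLAMD implementation: `toy_detector`, the form of `fitness` (detector score plus a per-feature penalty), the single-point crossover, and the one-bit mutation are all hypothetical choices made for the example. The only thing the search needs from the detector is its output score, which is what makes the test black-box.

```python
import math
import random


def toy_detector(x):
    """Hypothetical black-box detector: returns a malware score in [0, 1].

    Assumption: features 0-9 look malicious, features 10-19 look benign.
    """
    z = sum(0.8 * v for v in x[:10]) - sum(1.5 * v for v in x[10:])
    return 1.0 / (1.0 + math.exp(-z))


def perturb(original, mask):
    # Only add features (0 -> 1) so features the app already has are kept.
    return [o | m for o, m in zip(original, mask)]


def fitness(detect, original, mask, penalty=0.01):
    # Assumed fitness: detector score plus a small cost per added feature.
    added = sum(1 for o, m in zip(original, mask) if m and not o)
    return detect(perturb(original, mask)) + penalty * added


def genetic_attack(detect, original, population=150, iterations=50,
                   init_prob=0.01, mutation_prob=0.3, threshold=0.5, seed=0):
    rng = random.Random(seed)
    n = len(original)
    # Steps (1)-(2): start from random, mostly empty perturbation masks.
    masks = [[int(rng.random() < init_prob) for _ in range(n)]
             for _ in range(population)]
    for _ in range(iterations):
        # Steps (3)-(4): build candidates and query the detector.
        masks.sort(key=lambda m: fitness(detect, original, m))
        candidate = perturb(original, masks[0])
        # Steps (5) and (7): stop once the best candidate evades detection.
        if detect(candidate) < threshold:
            return candidate
        # Step (6): keep the fitter half, refill by crossover and mutation.
        parents = masks[:population // 2]
        children = []
        while len(parents) + len(children) < population:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]
            if rng.random() < mutation_prob:
                child[rng.randrange(n)] ^= 1
            children.append(child)
        masks = parents + children
    return None  # no evading sample found within the iteration budget


original = [1] * 5 + [0] * 15  # toy malware sample with five set features
adversarial = genetic_attack(toy_detector, original)
```

Restricting the perturbation to added features is one simple way to keep the original behavior intact; a real framework would also have to write the added features back into the application package.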
The environment of all experiments.
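| Component | Configuration |
|---|---|
| CPU | Intel(R) Core(TM) i5-7400 CPU @ 3.00GHz |
| Memory | 8 GB |
| GPU | Intel(R) HD Graphics 630 |
| Operating system | Windows 10 |
| Language | Python 3.6 |
| Development environment | Jupyter Notebook |
| Libraries | TensorFlow, Keras, NumPy, etc. |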
Eight features in the DREBIN dataset.
| Class | Name | Numbers | Rate (/Total) |
|---|---|---|---|
| S1 | Hardware Components | 72 | 0.013% |
| S2 | Requested Permissions | 3812 | 0.704% |
| S3 | App Components | 218,951 | 40.488% |
| S4 | Filtered Intents | 6379 | 1.178% |
| S5 | Restricted API Calls | 733 | 0.136% |
| S6 | Used Permissions | 70 | 0.013% |
| S7 | Suspicious API Calls | 315 | 0.058% |
| S8 | Network Address | 310,447 | 57.4% |
Figure 2. The sorting result of feature importance. The ordinate represents the behavioral feature categories and the abscissa represents the proportion of importance.
The detection results of five models.
| Models | TP | FP | FN | TN | Accuracy | Precision | Recall |
|---|---|---|---|---|---|---|---|
| NN (Neural Network) | 40770 | 0 | 74 | 1726 | 99.83% | 1 | 95.95% |
| LR (Logistic Regression) | 40770 | 0 | 234 | 1566 | 99.45% | 1 | 96.32% |
| DT (Decision Tree) | 40770 | 0 | 60 | 1740 | 99.86% | 1 | 95.91% |
| RF (Random Forest) | 40770 | 0 | 32 | 1768 | 99.92% | 1 | 95.85% |
| ET (Extreme Tree) | 40770 | 0 | 16 | 1784 | 99.96% | 1 | 95.81% |
TP = True Positive, FP = False Positive, FN = False Negative, TN = True Negative.
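As a quick consistency check, the Accuracy and Precision columns can be recomputed from the confusion-matrix counts using the standard definitions. The sketch below assumes those standard formulas; the helper names are illustrative. Recall is left out because the counts alone do not reproduce that column under the usual TP / (TP + FN) definition.

```python
def accuracy(tp, fp, fn, tn):
    # Fraction of all samples that were classified correctly.
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp, fp):
    # Fraction of predicted positives that are true positives.
    return tp / (tp + fp)


# (TP, FP, FN, TN) per model, copied from the table above.
results = {
    "NN": (40770, 0, 74, 1726),
    "LR": (40770, 0, 234, 1566),
    "DT": (40770, 0, 60, 1740),
    "RF": (40770, 0, 32, 1768),
    "ET": (40770, 0, 16, 1784),
}

for name, (tp, fp, fn, tn) in results.items():
    print(f"{name}: accuracy={accuracy(tp, fp, fn, tn):.2%}, "
          f"precision={precision(tp, fp):.0%}")
```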
The parameters of our approach.
| Features | S1: Hardware Components | S2: Requested Permissions |
|---|---|---|
| Initialize Probability | 1% | 0.01% |
| Mutation Probability | 30% | 0.5% |
| Iterations | 50 | 50 |
| Population | 150 | 150 |
| Attacked Samples | 1000 | 1000 |
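One way to read these settings is through the number of features they are expected to touch. The sketch below assumes that the initialization probability applies independently to each feature of a class and that every individual is scored once per iteration; neither assumption is confirmed by the table. The feature counts (72 for S1, 3812 for S2) come from the DREBIN feature table.

```python
# Settings copied from the parameter table; feature counts from the DREBIN table.
SETTINGS = {
    "S1": {"n_features": 72, "init_prob": 0.01, "mutation_prob": 0.30},
    "S2": {"n_features": 3812, "init_prob": 0.0001, "mutation_prob": 0.005},
}
ITERATIONS = 50
POPULATION = 150


def expected_initial_changes(n_features, init_prob):
    # Mean number of features switched on in one initial individual,
    # assuming an independent per-feature initialization probability.
    return n_features * init_prob


def max_queries(population, iterations):
    # Upper bound on detector queries per attacked sample,
    # assuming one query per individual per iteration.
    return population * iterations


for name, cfg in SETTINGS.items():
    mean_changes = expected_initial_changes(cfg["n_features"], cfg["init_prob"])
    print(f"{name}: about {mean_changes:.2f} features set per initial individual")
print(f"At most {max_queries(POPULATION, ITERATIONS)} queries per sample")
```

Under these assumptions, both classes start with fewer than one changed feature per individual on average, which would explain why the much larger S2 class uses a much smaller probability.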
The results of our approach.
| Model | Category | Success Rate | Average of |
|---|---|---|---|
| NN | S1 | 1 | 2.25 |
| NN | S2 | 1 | 2.33 |
| LR | S1 | 0.998 | 2.66 |
| LR | S2 | 0.995 | 1.94 |
| DT | S1 | 0.896 | 1.05 |
| DT | S2 | 0.992 | 1.68 |
| RF | S1 | 0.866 | 2.89 |
| RF | S2 | 0.995 | 9.54 |
| ET | S1 | 0.833 | 2.81 |
| ET | S2 | 0.945 | 9.36 |
Each row in the table is the average over the test results of 1000 samples.
Figure 3. The most frequently added permissions in our adversarial sample generation experiments. The data are averaged over the test results of 5 × 2 × 1000 samples.
Figure 4. Trend graph of fitness function values with number of iterations.
Figure 5. The fitness function values of adversarial samples for five detection models with S1 permission features.
Figure 6. The fitness function values of adversarial samples for five detection models with S2 permission features.