Enrique Garcia-Ceja, Brice Morin, Anton Aguilar-Rivera, Michael Alexander Riegler.
Abstract
In this work, we propose a genetic-algorithm-based attack against machine learning classifiers with the aim of 'stealing' users' biometric actigraphy profiles from health-related sensor data. The target classification model uses daily actigraphy patterns for user identification. The biometric profiles are modeled as what we call impersonator examples, which are generated based solely on the prediction confidence scores obtained by repeatedly querying the target classifier. We conducted experiments in a black-box setting on a public dataset that contains actigraphy profiles from 55 individuals. The data consists of daily motion patterns recorded with an actigraphy device. These patterns can be used as biometric profiles to identify each individual. Our attack was able to generate examples capable of impersonating a target user with a success rate of 94.5%. Furthermore, we found that the impersonator examples have high transferability to other classifiers trained with the same training set. We also show that the generated biometric profiles closely resemble the ground truth profiles, which can lead to sensitive data exposure, such as revealing the time of day an individual wakes up and goes to bed.
Keywords: Biometric profiles; Genetic algorithms; Impersonator attack; Machine learning
Year: 2020 PMID: 32929615 PMCID: PMC7497442 DOI: 10.1007/s10916-020-01646-y
Source DB: PubMed Journal: J Med Syst ISSN: 0148-5598 Impact factor: 4.460
Classifier performance results
| Metric | Value |
|---|---|
| Accuracy | 0.63 |
| True Positive Rate (TPR) | 0.63 |
| True Negative Rate (TNR) | 0.99 |
| Balanced accuracy | 0.81 |
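The balanced accuracy reported above is the mean of the true positive rate and the true negative rate, which compensates for the strong class imbalance inherent in one-vs-rest user identification. A quick check using only the table's values:

```python
tpr = 0.63  # True Positive Rate from the table
tnr = 0.99  # True Negative Rate from the table

# Balanced accuracy is the arithmetic mean of TPR and TNR.
balanced_accuracy = (tpr + tnr) / 2
print(round(balanced_accuracy, 2))  # 0.81, matching the table
```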
Fig. 1 Correlation between users’ activity levels
Fig. 2 Steps to generate impersonator examples. 1. Start with an initial random population, 4 in this case. 2. Query the target model using the initial random population. 3. Obtain the confidence scores from the target model for each individual. 4. Perform selection, crossover and mutation operations on the individuals based on the confidence scores to generate an improved population. 5. Query the target model with the improved population and repeat steps 3-5 until there is no improvement or the maximum number of allowed iterations has been reached. The final impersonator example is the individual with the highest fitness value
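The loop in Fig. 2 can be sketched as a standard genetic algorithm that treats the target model as a black box returning only a confidence score. The sketch below is illustrative, not the paper's implementation: `target_confidence` is a hypothetical stand-in (a distance-based score against a hidden profile), and the population size, tournament selection, one-point crossover, and mutation rate are assumed hyperparameters.

```python
import random

random.seed(42)

HOURS = 24  # one activity value per hour of the day

# Hypothetical stand-in for the black-box target model: in the real attack
# this would be a query to the deployed classifier, which returns only the
# confidence score for the target user.
_hidden_profile = [random.random() for _ in range(HOURS)]

def target_confidence(profile):
    """Black-box query: higher when `profile` is closer to the hidden one."""
    dist = sum((a - b) ** 2 for a, b in zip(profile, _hidden_profile))
    return 1.0 / (1.0 + dist)

def evolve_impersonator(pop_size=10, max_iters=200, mutation_rate=0.1):
    # Step 1: initial random population.
    pop = [[random.random() for _ in range(HOURS)] for _ in range(pop_size)]
    best, best_fit = None, -1.0
    for _ in range(max_iters):
        # Steps 2-3: query the target model, collect confidence scores.
        fits = [target_confidence(ind) for ind in pop]
        top = max(range(pop_size), key=fits.__getitem__)
        if fits[top] > best_fit:
            best, best_fit = pop[top][:], fits[top]
        # Step 4: tournament selection, one-point crossover, mutation.
        new_pop = []
        while len(new_pop) < pop_size:
            p1 = max(random.sample(range(pop_size), 2), key=fits.__getitem__)
            p2 = max(random.sample(range(pop_size), 2), key=fits.__getitem__)
            cut = random.randrange(1, HOURS)
            child = pop[p1][:cut] + pop[p2][cut:]
            for i in range(HOURS):
                if random.random() < mutation_rate:
                    child[i] = random.random()
            new_pop.append(child)
        # Step 5: repeat with the improved population.
        pop = new_pop
    # The final impersonator example is the fittest individual seen.
    return best, best_fit
```

Running `evolve_impersonator()` returns a candidate profile and its confidence under the stand-in model; the attack's "strength" in the tables below corresponds to larger population sizes and query budgets.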
Fig. 3 Experiment workflow. 1. The biometric profiles are used to train the target model and additional models that will be used to test transferability. 2. The proposed method is used to attack the target model and generate impersonator examples. 3. The impersonator examples are used to impersonate the users on both the target model and the models used to test transferability
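The transferability check in step 3 of the workflow amounts to training several classifiers on the same data and testing whether an example crafted against one of them also fools the others. A minimal self-contained sketch, using synthetic profiles, a nearest-centroid "target model", a 1-NN "transferability model", and a placeholder impersonator example in place of a GA-generated one (all assumptions, not the paper's setup):

```python
import random

random.seed(0)

HOURS = 24

# Synthetic stand-in for the actigraphy data: 3 users, each with a
# characteristic mean activity level, 20 daily profiles per user.
def make_profiles(user, n=20):
    return [[user + random.gauss(0, 0.3) for _ in range(HOURS)]
            for _ in range(n)]

train = {u: make_profiles(u) for u in range(3)}

# Step 1: two simple classifiers trained on the same training set, standing
# in for the target model and one transferability model.
def centroid(profiles):
    return [sum(p[i] for p in profiles) / len(profiles) for i in range(HOURS)]

centroids = {u: centroid(ps) for u, ps in train.items()}

def nearest_centroid(x):  # "target model"
    return min(centroids,
               key=lambda u: sum((a - b) ** 2
                                 for a, b in zip(x, centroids[u])))

def nearest_neighbor(x):  # "transferability model" (1-NN)
    best_u, best_d = None, float("inf")
    for u, ps in train.items():
        for p in ps:
            d = sum((a - b) ** 2 for a, b in zip(x, p))
            if d < best_d:
                best_u, best_d = u, d
    return best_u

# Steps 2-3: a placeholder impersonator example for user 2 is accepted by
# the target model and then tested for transfer to the second model.
impersonator = [2.0] * HOURS
assert nearest_centroid(impersonator) == 2
print("transfers to 1-NN:", nearest_neighbor(impersonator) == 2)
```

An example that succeeds on the target but is rejected by another model would count against the transferability rates reported in the table below.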
Results for the three genetic attacks with incremental strength
| Attack | Population size | Avg. generations | Avg. time (s) | Success rate (SR) |
|---|---|---|---|---|
| Weak attack | 10 | 841 | 30 | 76.3% |
| Medium attack | 20 | 922 | 53 | 91.0% |
| Strong attack | 50 | 1019 | 116 | 94.5% |
Fig. 4 Biometric profile of the initial random solution, ground truth and resulting impersonator example for the target user control10
Fig. 5 Heatmap of the initial random solution, ground truth and resulting impersonator example for the target user control10. h1–h24 are the hours of the day
Transferability success rates
| Classifier | With confidence threshold | No threshold |
|---|---|---|
| Naive Bayes | 61.5% | 61.5% |
| KNN, k = 1,3,5 | 96.0%,73.0%,69.0% | 96.0%,73.0%,69.0% |
| SVM | 2.0% | 33.0% |
| Neural Network | 11.5% | 23.0% |
| LDA | 57.7% | 64.0% |
Results for the strong genetic attack when the target model omits confidence scores
| Attack | Population size | Avg. generations | Avg. time (s) | Success rate (SR) |
|---|---|---|---|---|
| Strong attack | 50 | 1500 | 302 | 1.8% |
Fig. 6 Ground truth biometric profiles of failed and successfully impersonated users. (For a colored version of this plot, please see the online version)