| Literature DB >> 36015776 |
Islam Md Shafiqul, Mir Kanon Ara Jannat, Jin-Woo Kim, Soo-Wook Lee, Sung-Hyun Yang.
Abstract
Nowadays, WiFi-based human activity recognition (WiFi-HAR) has attracted much attention for indoor environments owing to its many benefits, including privacy and security, device-free sensing, and cost-effectiveness. However, recognizing human-human interactions (HHIs) from channel state information (CSI) signals remains challenging. Although several deep learning (DL) architectures have been proposed for this task, most suffer from limited recognition accuracy and, because of their large number of model parameters, cannot run on devices with low computational resources. To address these issues, we propose a dynamic method using a lightweight DL model (HHI-AttentionNet) to automatically recognize HHIs, which significantly reduces the parameter count while increasing recognition accuracy. In addition, we present an Antenna-Frame-Subcarrier Attention Mechanism (AFSAM) that enhances the model's representational capability to recognize HHIs correctly. As a result, HHI-AttentionNet focuses on the most significant features, ignores irrelevant ones, and reduces the impact of the CSI signal's complexity. We evaluated HHI-AttentionNet on a publicly available CSI-based HHI dataset collected from 40 pairs of subjects performing 13 different HHIs, and compared it against existing methods. HHI-AttentionNet proved the best model, providing an average accuracy, F1 score, Cohen's kappa, and Matthews correlation coefficient (MCC) of 95.47%, 95.45%, 0.951, and 0.950, respectively, for recognition of the 13 HHIs. It outperforms the accuracy of the best existing model by more than 4%.
Keywords: antenna-frame-subcarrier attention mechanism (AFSAM); channel state information (CSI); deep learning (DL); human activity recognition (HAR); human-human interactions (HHIs)
Year: 2022 PMID: 36015776 PMCID: PMC9414797 DOI: 10.3390/s22166018
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
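The abstract describes AFSAM as attention applied jointly over the antenna, frame, and subcarrier dimensions of the CSI input. The exact layers are not reproduced in this record, so the following PyTorch sketch shows only one plausible reading: a squeeze-and-excitation style gate per axis. The class name `AFSAM` as used here, the reduction ratio `r`, and the pooling choices are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class AFSAM(nn.Module):
    """Sketch of an antenna-frame-subcarrier attention mechanism.

    Input: a CSI feature map of shape (batch, antennas, frames, subcarriers).
    For each axis we pool the other two axes away, pass the 1-D profile
    through a small bottleneck MLP, and rescale the input with a sigmoid
    gate along that axis. Layer sizes are illustrative assumptions.
    """

    def __init__(self, n_ant: int, n_frame: int, n_sub: int, r: int = 4):
        super().__init__()
        self.gates = nn.ModuleList(
            nn.Sequential(
                nn.Linear(n, max(n // r, 1)),
                nn.ReLU(inplace=True),
                nn.Linear(max(n // r, 1), n),
                nn.Sigmoid(),
            )
            for n in (n_ant, n_frame, n_sub)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, antennas, frames, subcarriers)
        for axis, gate in zip((1, 2, 3), self.gates):
            pool_dims = [d for d in (1, 2, 3) if d != axis]
            profile = x.mean(dim=pool_dims)   # (batch, x.shape[axis])
            weight = gate(profile)            # gate values in (0, 1)
            shape = [x.shape[0], 1, 1, 1]
            shape[axis] = x.shape[axis]
            x = x * weight.view(*shape)       # broadcast along that axis
        return x


# Example: 3 antennas, 256 frames, 30 subcarriers (shapes are illustrative)
attn = AFSAM(n_ant=3, n_frame=256, n_sub=30)
out = attn(torch.randn(8, 3, 256, 30))        # same shape as the input
```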
Details of the CSI-based HHI dataset.
| Interaction | Label | No. of Samples | Interaction | Label | No. of Samples |
|---|---|---|---|---|---|
| Approaching | I1 | 3359 | Pointing with the left hand | I8 | 4067 |
| Departing | I2 | 3115 | Pointing with the right hand | I9 | 4081 |
| Handshaking | I3 | 3606 | Punching with the left hand | I10 | 2497 |
| High five | I4 | 3643 | Punching with the right hand | I11 | 2500 |
| Hugging | I5 | 2480 | Pushing | I12 | 3610 |
| Kicking with the left leg | I6 | 2471 | Steady state | I13 | 22,792 |
| Kicking with the right leg | I7 | 2489 | | | |
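The sample counts above are heavily skewed toward the steady state class (22,792 samples versus roughly 2,500-4,100 for every other interaction). This record does not state how the imbalance was handled; inverse-frequency class weighting is one standard option, and the sketch below derives such weights from the table purely as an illustration.

```python
# Per-class sample counts copied from the table above (labels I1-I13).
counts = {
    "I1": 3359, "I2": 3115, "I3": 3606, "I4": 3643, "I5": 2480,
    "I6": 2471, "I7": 2489, "I8": 4067, "I9": 4081, "I10": 2497,
    "I11": 2500, "I12": 3610, "I13": 22792,
}

# Inverse-frequency weights, normalized so the mean weight is 1.0.
# Whether the authors used class weighting at all is not stated here.
total = sum(counts.values())
weights = {k: total / (len(counts) * v) for k, v in counts.items()}
print(weights["I13"])  # steady state gets the smallest weight (~0.20)
```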
Figure 1. Block diagram of the methodological steps for recognizing HHIs.
Figure 2. Raw and smoothed CSI signals for selected interactions: (a) Approaching, (b) Departing, (c) Handshaking, (d) High five, (e) Hugging, (f) Steady state.
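Figure 2 contrasts raw and smoothed CSI amplitude, but the specific filter is not given in this excerpt. A Savitzky-Golay filter is one common choice for CSI denoising, so the sketch below uses it; the window and polynomial order are illustrative, not the authors' settings.

```python
import numpy as np
from scipy.signal import savgol_filter


def smooth_csi(amplitude: np.ndarray, window: int = 15, poly: int = 3) -> np.ndarray:
    """Smooth CSI amplitude along the frame (time) axis.

    amplitude: array of shape (frames, subcarriers) of CSI magnitudes.
    window/poly are illustrative assumptions.
    """
    return savgol_filter(amplitude, window_length=window, polyorder=poly, axis=0)


raw = np.abs(np.random.randn(256, 30))   # stand-in for real CSI magnitudes
smoothed = smooth_csi(raw)               # same shape, less high-frequency noise
```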
Figure 3. Architecture of the proposed HHI-AttentionNet model.
Summary of the HHI-AttentionNet model.

| Section | Layer Type | Output Shape | Parameters |
|---|---|---|---|
| Feature extraction | Conv 2D | 256 × 15 × 32 | 1760 |
| | BN and ReLU | 256 × 15 × 32 | 128 |
| | Conv 2D | 128 × 8 × 64 | 2816 |
| | AFSAM | 128 × 8 × 64 | 4145 |
| | Conv 2D | 64 × 4 × 128 | 9728 |
| | Conv 2D | 32 × 2 × 128 | 18,816 |
| | AFSAM | 32 × 2 × 128 | 16,433 |
| | Conv 2D | 16 × 1 × 256 | 35,840 |
| | Conv 2D | 8 × 1 × 256 | 70,400 |
| Recognition | GAP | 1 × 256 | 0 |
| | Dropout (0.20) | 1 × 256 | 0 |
| | Dense | 1 × 64 | 16,448 |
| | Softmax | 1 × 13 | 845 |
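The recognition section of this table is fully pinned down by its parameter counts: a Dense(64) layer on the 256-dimensional GAP output has 256 × 64 + 64 = 16,448 weights, and the 13-way softmax layer has 64 × 13 + 13 = 845. A minimal PyTorch sketch of that head follows; the ReLU on the dense layer is an assumption, and the convolutional feature extractor feeding it is omitted.

```python
import torch.nn as nn

# Recognition head matching the table's shapes and parameter counts.
# PyTorch is channels-first, so the table's 8 × 1 × 256 feature map
# becomes a (batch, 256, 8, 1) tensor here.
head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),  # GAP -> (batch, 256, 1, 1)
    nn.Flatten(),             # -> (batch, 256)
    nn.Dropout(0.20),
    nn.Linear(256, 64),       # 256*64 + 64 = 16,448 parameters
    nn.ReLU(),                # activation not stated in the table; assumed
    nn.Linear(64, 13),        # 64*13 + 13 = 845 parameters
    nn.Softmax(dim=1),        # 13-way class probabilities
)
```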
Performance of the proposed model on the CSI-based HHI dataset with 10-fold cross-validation. Accuracy and F1 score are percentages (%); k-score and MCC are unitless.

| No. of Classes | Metric | 1st | 2nd | 3rd | 4th | 5th | 6th | 7th | 8th | 9th | 10th | Average |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 12 | Accuracy (%) | 94.60 | 94.74 | 95.00 | 94.85 | 94.44 | 94.60 | 94.56 | 94.26 | 94.23 | 95.04 | 94.55 ± 0.25 |
| | F1 score (%) | 94.56 | 94.70 | 94.95 | 94.75 | 94.42 | 94.56 | 94.52 | 94.15 | 94.20 | 94.81 | 94.50 ± 0.24 |
| | k-score | 0.945 | 0.947 | 0.948 | 0.946 | 0.944 | 0.945 | 0.945 | 0.941 | 0.942 | 0.948 | 0.945 ± 0.22 |
| | MCC | 0.944 | 0.946 | 0.948 | 0.945 | 0.943 | 0.944 | 0.954 | 0.941 | 0.941 | 0.947 | 0.945 ± 0.38 |
| 13 | Accuracy (%) | 95.44 | 95.58 | 95.23 | 95.51 | 95.23 | 95.77 | 95.53 | 95.67 | 95.18 | 95.60 | 95.47 ± 0.19 |
| | F1 score (%) | 95.41 | 95.56 | 95.21 | 95.49 | 95.22 | 95.74 | 95.51 | 95.66 | 95.16 | 95.55 | 95.45 ± 0.19 |
| | k-score | 0.950 | 0.951 | 0.948 | 0.951 | 0.947 | 0.954 | 0.951 | 0.953 | 0.947 | 0.953 | 0.951 ± 0.20 |
| | MCC | 0.950 | 0.951 | 0.948 | 0.951 | 0.948 | 0.953 | 0.951 | 0.952 | 0.946 | 0.952 | 0.950 ± 0.20 |
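For reference, the four metrics in the table map directly onto scikit-learn functions; the sketch below computes one fold's scores from predictions. The weighted F1 average is an assumption, since this record does not state which F1 variant was reported.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, cohen_kappa_score,
                             f1_score, matthews_corrcoef)


def fold_metrics(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """Per-fold scores as reported in the table above."""
    return {
        "accuracy_pct": 100.0 * accuracy_score(y_true, y_pred),
        "f1_pct": 100.0 * f1_score(y_true, y_pred, average="weighted"),
        "k_score": cohen_kappa_score(y_true, y_pred),   # unitless
        "mcc": matthews_corrcoef(y_true, y_pred),       # unitless
    }
```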
Figure 4. Confusion matrix of the HHI-AttentionNet model for HHI recognition.
Parameters and times of the proposed HHI-AttentionNet model.

| Model | No. of Classes | Trainable Parameters | Non-Trainable Parameters | Total Parameters | Training Time (s) | Recognition Time (s) |
|---|---|---|---|---|---|---|
| HHI-AttentionNet | 12 | 173,406 | 2944 | 176,350 | 1615 ± 1.9 | 0.000198 ± 0.000012 |
| HHI-AttentionNet | 13 | 173,551 | 2944 | 176,495 | 3000 ± 1.4 | 0.000200 ± 0.000014 |
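The per-sample recognition times above are on the order of 0.2 ms. How they were measured (hardware, batching, warm-up) is not specified in this record, so the sketch below shows only one reasonable timing protocol for a PyTorch model; `model` and `samples` are placeholders.

```python
import time
import numpy as np
import torch


@torch.no_grad()
def mean_recognition_time(model: torch.nn.Module,
                          samples: torch.Tensor,
                          repeats: int = 10) -> tuple[float, float]:
    """Mean and std of per-sample inference latency in seconds."""
    model.eval()
    per_sample = []
    for _ in range(repeats):
        start = time.perf_counter()
        for s in samples:
            model(s.unsqueeze(0))          # one sample at a time
        per_sample.append((time.perf_counter() - start) / len(samples))
    return float(np.mean(per_sample)), float(np.std(per_sample))
```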
Figure 5. (a) Training and testing accuracy; (b) training and testing loss of the proposed method.
Figure 6. t-SNE visualization of test data (a) before and (b) after representation learning by the proposed model.
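Figure 6's panels can be reproduced with scikit-learn's t-SNE applied to flattened raw inputs (before) and to learned embeddings such as the penultimate-layer features (after). The perplexity and colormap below are illustrative choices, not the authors' settings.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE


def plot_tsne(features: np.ndarray, labels: np.ndarray, title: str) -> None:
    """2-D t-SNE projection of test features, colored by interaction label."""
    z = TSNE(n_components=2, perplexity=30.0, init="pca",
             random_state=0).fit_transform(features)
    plt.scatter(z[:, 0], z[:, 1], c=labels, s=4, cmap="tab20")
    plt.title(title)
    plt.show()
```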
Performance comparison of the proposed method with existing methods on the CSI HHI dataset. Boldface denotes the best result; (-) denotes information that is not available.

| Study | Methodology (Year) | Accuracy (%) | F1 (%) | k-Score (%) | MCC (%) | Trainable Parameters | Recognition Time (s) |
|---|---|---|---|---|---|---|---|
| Alazrai et al. [ ] | SVM (2021) | 69.79 | - | - | - | - | - |
| Alazrai et al. [ ] | E2EDLF (2020) | 86.30 | 86.00 | 85.00 | - | 935,053 | 0.00022 ± 0.000018 |
| Kabir et al. [ ] | CSI-IANet (2021) | 91.30 | 91.27 | 89.42 | - | 546,321 | 0.00036 ± 0.000025 |
| Proposed | HHI-AttentionNet (2022) | **95.47** | **95.45** | **95.10** | **95.00** | **173,551** | **0.00020 ± 0.000014** |