
Securing the Insecure: A First-Line-of-Defense for Body-Centric Nanoscale Communication Systems Operating in THz Band.

Waqas Aman^1,2, Muhammad Mahboob Ur Rahman^1, Hasan T Abbas^2, Muhammad Arslan Khalid^3, Muhammad A Imran^2,4, Akram Alomainy^5, Qammer H Abbasi^2.

Abstract

This manuscript presents a novel mechanism (at the physical layer) for authentication and transmitter identification in a body-centric nanoscale communication system operating in the terahertz (THz) band. The unique characteristics of the propagation medium in the THz band render the existing techniques (e.g., for impersonation detection in cellular networks) inapplicable. In this work, we considered a body-centric network with multiple on-body nano-sensor nodes (some of which have been compromised) that communicate their sensed data to a nearby gateway node. We proposed to protect the transmissions on the link between the legitimate nano-sensor nodes and the gateway by exploiting the path loss of the THz propagation medium as the fingerprint/feature of the sender node to carry out authentication at the gateway. Specifically, we proposed a two-step hypothesis testing mechanism at the gateway to counter the impersonation (false data injection) attacks by malicious nano-sensors. To this end, we computed the path loss of the THz link under consideration using the high-resolution transmission molecular absorption (HITRAN) database. Furthermore, to refine the outcome of the two-step hypothesis testing device, we modeled the impersonation attack detection problem as a hidden Markov model (HMM), which was then solved by the classical Viterbi algorithm. As a by-product of the authentication problem, we performed transmitter identification (when the two-step hypothesis testing device decides no impersonation) using (i) the maximum likelihood (ML) method and (ii) the Gaussian mixture model (GMM), whose parameters are learned via the expectation-maximization algorithm. Our simulation results showed that the two error probabilities (missed detection and false alarm) were decreasing functions of the signal-to-noise ratio (SNR). Specifically, at an SNR of 10 dB with a pre-specified false alarm rate of 0.2, the probability of correct detection was almost one. We further noticed that the HMM method outperformed the two-step hypothesis testing method at low SNRs (e.g., a 10% increase in accuracy was recorded at SNR = -5 dB), as expected. Finally, it was observed that the GMM method was useful when the ground truths (the true path loss values for all the legitimate THz links) were noisy.


Keywords: authentication; body-centric sensor networks; healthcare systems; nano sensors; nanoscale communication; outlier detection; security; sensor networks; terahertz communication


Year: 2021 PMID: 34069503 PMCID: PMC8160706 DOI: 10.3390/s21103534

Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576

1. Introduction

Nanoscale communication systems have attracted researchers due to their promising applications in healthcare, manufacturing industries, environmental control, etc. [1]. On the other hand, body-centric communication has potential applications in healthcare, entertainment, etc. [2]. Generally, body-centric communication is classified as “off”-, “on”-, and “in”-body communication based on the communication among implanted or wearable electronic devices. In this work, we focused on the body-centric communication systems where nano sensors/devices operating in the THz band are deployed on the body of a human being. Due to the small size of nano devices, the existing frameworks, techniques, and methods proposed for communication networks such as WiFi, 4G, etc., are not suitable for exchanging information amongst the nano devices [3]. For instance, nano devices are unable to operate at microwave bands due to their small size. They would require molecular communication and the terahertz (THz) band for operation. Additionally, in IoT devices, due to the small energy sources, the computational processing capability is limited. Therefore, it is necessary to meet the requirements for new protocols of nano devices at all layers of the protocol stack. Operating in the THz band (0.1–10 THz) is a promising solution at the physical layer (PL) [4], which makes the antenna size very small and thus suitable for exchanging information between nano devices. Like other communication networks, the body-centric nanoscale communication networks are also prone to a wide range of active and passive attacks by adversaries [5]. Some of the common attacks include eavesdropping, impersonation, denial of service (DoS), etc. Here, we investigated an impersonation attack in body-centric nanoscale communication networks. Figure 1 shows an illustration of an impersonation attack on a smart healthcare system scenario. 
The nano nodes are deployed on the body of a person/patient for disease diagnostics or to remotely monitor his/her health parameters. These nano devices are connected to a wearable device, which communicates the data to an outdoor network via a nano-to-micro interface. Assuming an enemy of the person secretly deployed its own nano machines nearby with the aim of impersonating the person’s legitimate nodes to report false measurements to the remote health unit, an incorrect response through the nano machines or nearby doctors could result in devastating consequences. Therefore, we need an authentication mechanism at the nano-to-micro interface device (wearable device) to allow data transmission (reported measurements, i.e., glucose, blood pressure, etc., of nano nodes/sensors) from legitimate nano nodes only, blocking all malicious nodes.

Figure 1

An envisioned future body-centric nanoscale healthcare system with possible malicious nodes.

In traditional communication systems, the countermeasures for such attacks are performed at the higher layer using cryptography. Despite the wide work in the field of cryptography, the mechanism can be compromised because of its sole dependency on the predefined shared secret among the legitimate users. With recent advances in quantum computing, traditional encryption has become vulnerable to being easily decoded, and existing crypto-based measures are not quantum secure unless the size of secret keys increases to impractical lengths [6]. In this regard, physical layer (PL) security finds itself as a promising mechanism in future communication systems. PL security exploits the random nature of the physical medium/layer for security purposes [7]. Authentication is one of the pillars required for the security of any communication system. PL authentication is a systematic procedure that uses PL’s features to provide authentication. In conventional systems, asymmetric key encryption (AKE) is typically used in the authentication phase, which is the realm of public key encryption (a crypto-based approach). Such schemes are quantum insecure and incur overhead or high computations, which not only increase the size of the device, but also consume much power. The devices fabricated for nanoscale communication are energy constrained as they incorporate a small source of energy (a battery). PL authentication has a low overhead (a simple procedure that typically includes feature estimation and testing) and is almost impossible to clone unless the devices lie on each other. Various fingerprints including RSS [8], CIR [9,10], CFR [11,12], carrier frequency offset [13], and I/Q imbalance [14] have been reported for PL authentication in conventional communication systems. Related Work: The authors in [15] for the first time studied authentication using path loss (S21 parameter) in body-centric communication using millimeter waves. 
Regarding the security of systems operating in the THz band, we found some works [5,16,17,18] in the literature. The work [16] provided the first study on the security challenges faced by nanoscale communication systems, while the work [17] presented some possible promising applications along with the security challenges in the Internet of Nano-Things. Further, the experimental work of Jianjun et al. [5] for the first time rejected the claim about security in the THz band. The claim was that the inherent narrow beamwidth of the THz link makes it secure and thus impossible for a malicious node to accomplish an eavesdropping attack. The authors in [5] in their experiments used reflectors of different shapes between the THz transmitter and receiver. Then, with the help of secrecy capacity and blockage as performance metrics, they clearly demonstrated that eavesdropping attacks in the THz band can be easily performed.

The differences between our work and previous work are as follows: The first work [15], which studied the authentication problem in body-centric communication systems, considered millimeter-wave communication with a three-node setup. In contrast, our work considered multiple legitimate and malicious nodes operating in the THz band. The work [5] considered an eavesdropping attack in a system operating in the THz band, which was a different problem/attack than the attack we considered in our work. Next, in our previous work [18], we studied PL authentication for an in vivo nanoscale communication system whereby we utilized the path loss as the device fingerprint for a three-node system (i.e., Alice, Eve, and Bob). The difference between our previous work [18] and this work was twofold. First, the previous work was limited to the three-node system only, while in this work, the system model comprised multiple legitimate and malicious nodes. 
Second, the previous work was for an in vivo nanoscale communication system where authentication occurs at a nano node (Bob), whereas in this work, authentication occurs at a nano-to-micro interface device (wearable device) in an on-body system.

Contributions: For the first time, this work studied authentication at a nano-to-micro interface device (wearable device) in an on-body-centric communication system where we exploited the high-resolution transmission molecular absorption (HITRAN) database [19] for computing the path loss. For the first time, impersonation attack detection at the wearable device/receiver/Bob in the presence of multiple legitimate and malicious nano nodes operating in the THz band is performed via different mechanisms. We performed authentication by two-step hypothesis testing. We refined the output of the hypothesis testing via the hidden Markov model (HMM) with the Viterbi algorithm. We also performed transmitter identification via the maximum likelihood method and the Gaussian mixture model (GMM) with the expectation–maximization algorithm.

Outline: The rest of this paper is organized as follows. Section 2 provides the system model. Section 3 discusses authentication via two-step hypothesis testing. Section 4 presents the hidden Markov model to refine the output of hypothesis testing. Section 5 provides transmitter identification schemes. Section 6 presents simulation results with discussions, and Section 7 concludes the paper.

2. System Model

For the purposes of the simulation, we considered a square 2D map/layout of size (1 m × 1 m) where the nano transmit (Tx) nodes, i.e., M Alice (legitimate) nodes $\{A_i\}_{i=1}^{M}$ and N Eve (malicious) nodes $\{E_j\}_{j=1}^{N}$, are deployed according to a uniform distribution, whilst a nano-to-micro interface device/receiver node, Bob, is placed at the origin, as shown in Figure 2. We assumed that the Tx nodes transmitted with a fixed/pre-specified transmit power so that the path loss can be computed by Bob.
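The deployment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the square is assumed to be centered on Bob (the text only says Bob sits at the origin), and the helper names `deploy_nodes` and `distances_to_bob` are hypothetical.

```python
import math
import random


def deploy_nodes(count, side=1.0, rng=random):
    """Draw `count` node positions uniformly over a side x side square.

    Assumption: the square is centered on Bob at (0, 0).
    """
    half = side / 2.0
    return [(rng.uniform(-half, half), rng.uniform(-half, half))
            for _ in range(count)]


def distances_to_bob(nodes):
    """Euclidean distance of every node to Bob at the origin."""
    return [math.hypot(x, y) for x, y in nodes]


if __name__ == "__main__":
    rng = random.Random(0)                 # fixed seed for repeatability
    alice = deploy_nodes(10, rng=rng)      # M = 10 is an illustrative choice
    eve = deploy_nodes(10, rng=rng)        # N = 10 is an illustrative choice
    print(distances_to_bob(alice))
    print(distances_to_bob(eve))
```

Each distance can then be mapped to a path loss value, which serves as the fingerprint of the corresponding link.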

Figure 2

System model for the simulation purposes: Bob is placed at the origin. Alice’s and Eve’s node locations are modeled as uniformly distributed random variables. In this case, and .

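The path loss is given as [20,21]:

$$ L(f, d) = L_{\mathrm{abs}}(f, d)\, L_{\mathrm{spread}}(f, d), $$

where f is the frequency, d is the distance, $L_{\mathrm{abs}}$ is the absorption loss, and $L_{\mathrm{spread}}$ is the spreading loss (when expressed in dB, the two losses add). More details of spreading and absorption losses are given in Appendix A. In the next section, we discuss the two-step mechanism for impersonation detection.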
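As a rough numerical illustration of how the two loss terms combine, the sketch below uses the forms commonly found in the THz literature: a free-space spreading loss $(4\pi f d / c)^2$ and a Beer-Lambert absorption loss $e^{k d}$. These forms are assumptions here, since Appendix A is not reproduced in this text. The absorption coefficient `k_abs` is a hypothetical placeholder; in the paper it is derived from the HITRAN database.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def spreading_loss(f_hz, d_m):
    """Free-space spreading loss on a linear scale: (4*pi*f*d/c)^2."""
    return (4.0 * math.pi * f_hz * d_m / C) ** 2


def absorption_loss(k_abs, d_m):
    """Molecular absorption loss on a linear scale: exp(k*d).

    k_abs (1/m) is a placeholder value, not a HITRAN-derived coefficient.
    """
    return math.exp(k_abs * d_m)


def path_loss_db(f_hz, d_m, k_abs):
    """Total path loss in dB: product of both losses, converted to dB."""
    total = spreading_loss(f_hz, d_m) * absorption_loss(k_abs, d_m)
    return 10.0 * math.log10(total)


if __name__ == "__main__":
    for d in (0.1, 0.5, 1.0):
        print(d, path_loss_db(1e12, d, k_abs=2.0))
```

Because the path loss grows monotonically with distance, nodes at different distances from Bob produce distinct fingerprints, which is the property the authentication scheme relies on.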

3. Authentication via Two-Step Hypothesis Testing

We assumed that the shared channel is time-slotted, whilst the transmit nodes perform channel sensing before transmitting; hence, there are no collisions. Without loss of generality, it can be assumed that $A_i$ is the legitimate node for slot k, but if $A_i$ does not transmit during this time slot, an Eve node $E_j$ could transmit to Bob pretending to be an Alice node. Therefore, Bob needs to authenticate each message received on the shared channel and verify the transmitter identity (if no impersonation has been declared) in a systematic manner. Assume that the noisy measurement $z(k) = L + n(k)$ has been obtained at time k (for instance, by using the pulse-based method as discussed in [22]), where $n(k) \sim \mathcal{N}(0, \sigma^2)$ and L is the path loss. Furthermore, in line with previous studies [18,23], we assumed that Bob has already learned the ground truth via prior training on a secure channel. The ground truth vector can be denoted by $\mathbf{L} = [L_1, \ldots, L_M]^T$, where $L_i$ is the path loss of the link between $A_i$ and Bob.

The two-step hypothesis test works as follows. The first step is a maximum likelihood (ML) test that selects the Alice node whose ground truth is closest to the measurement:

$$ i^* = \arg\min_{i} |z(k) - L_i|, \qquad T(k) = |z(k) - L_{i^*}|. $$

Next, the binary hypothesis test decides between $H_0$ (no impersonation) and $H_1$ (impersonation):

$$ T(k) \underset{H_0}{\overset{H_1}{\gtrless}} \epsilon, $$

where $\epsilon$ is a small threshold, a design parameter. This work followed the Neyman–Pearson theorem [24], which states that, for a pre-specified false alarm probability $P_{fa}$, $\epsilon$ can be chosen such that the missed detection probability $P_{md}$ is minimized. The two error probabilities are expressed in terms of $Q(\cdot)$, the complementary cumulative distribution function (ccdf) of a standard normal distribution, and the prior probabilities of the transmit nodes: $\pi(i)$ for $A_i$ and $\pi(j)$ for $E_j$, the latter being the fraction of slots that were originally dedicated to the Alice nodes, but were found idle and thus utilized by $E_j$. The threshold $\epsilon$ is computed from the pre-specified $P_{fa}$, and $P_{md}$ then follows. Since $P_{md}$ depends on the unknown path loss of the Eve node, it is a random variable (R.V.), and its expected value is used. Next, we discuss the HMM for refining the outcomes/results of the two-step hypothesis testing.
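The two-step test can be illustrated with a short Python sketch. It makes a simplifying assumption that is not taken from the paper: the threshold is set from the single-link relation $P_{fa} = 2Q(\epsilon/\sigma)$, which ignores the node priors and the effect of neighbouring ground truths. The function names are hypothetical.

```python
from statistics import NormalDist


def q_inv(p):
    """Inverse of Q(x) = 1 - Phi(x), the standard normal ccdf."""
    return NormalDist().inv_cdf(1.0 - p)


def threshold_from_pfa(p_fa, sigma):
    """Threshold for a target false-alarm rate.

    Assumption: P_fa = 2 * Q(eps / sigma), a single-link approximation.
    """
    return sigma * q_inv(p_fa / 2.0)


def two_step_test(z, ground_truth, eps):
    """Return (impersonation_flag, index of identified Alice node or None)."""
    # Step 1: ML test, pick the ground-truth path loss nearest to z.
    i_star = min(range(len(ground_truth)),
                 key=lambda i: abs(z - ground_truth[i]))
    statistic = abs(z - ground_truth[i_star])
    # Step 2: binary test, declare impersonation if the gap exceeds eps.
    impersonation = statistic > eps
    return impersonation, (None if impersonation else i_star)


if __name__ == "__main__":
    truth = [70.0, 80.0, 90.0]              # illustrative path losses in dB
    eps = threshold_from_pfa(0.2, sigma=1.0)
    print(eps)
    print(two_step_test(80.3, truth, eps))  # close to the second Alice node
    print(two_step_test(85.0, truth, eps))  # far from every ground truth
```

When no impersonation is declared, the index returned by the first step doubles as the transmitter identity, which is why ML-based identification comes as a by-product of authentication.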

4. Hidden Markov Model-Based Approach

To refine the output of the two-step hypothesis testing, we used the HMM-based approach. More specifically, at a given time instant k, the system is in one of two states, with the state space $\{s_0, s_1\}$. The states $s_0$ and $s_1$ imply that there is no impersonation and impersonation, respectively, at time k. However, the true state $x(k)$ of the system is hidden; what we observe through the hypothesis test is another, observable Markov chain $y(k)$. The connection between the true/hidden state and the observable state is given by the emission probability matrix $\mathbf{E} = [e_{i,j}]$, where $e_{i,j} = P(y(k) = s_j \mid x(k) = s_i)$, $i, j \in \{0, 1\}$. The off-diagonal elements in the i-th row of $\mathbf{E}$ represent the errors made by the ML test, i.e., deciding the state as $s_j$ while the system was actually in state $s_i$. The transition from state i to state j occurs after a fixed interval of $T_m$ seconds, where $1/T_m$ is the measurement rate. Assume that the system was in state $s_i$ at time $k-1$, i.e., $x(k-1) = s_i$, and that we want to predict the probability vector $\mathbf{p}(k)$ at time k, whose j-th entry is the probability that the system is in state $s_j$, $j \in \{0, 1\}$. To this end, we have the transition probability matrix $\mathbf{P} = [p_{i,j}]$, where $p_{i,j} = P(x(k) = s_j \mid x(k-1) = s_i)$, $i, j \in \{0, 1\}$. Then, we have the following relation: $\mathbf{p}(k)^T = \mathbf{p}(k-1)^T \mathbf{P}$. Alternatively, we can write: $\mathbf{p}(k) = \mathbf{P}^T \mathbf{p}(k-1)$.
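The one-step prediction of the state probability vector is a vector-matrix product. The sketch below assumes a row-stochastic transition matrix whose entry (i, j) is the probability of moving from state i to state j; the numerical values are illustrative only.

```python
def predict_state_probs(p_prev, trans):
    """Propagate state probabilities one step: p(k)[j] = sum_i p(k-1)[i] * trans[i][j]."""
    n = len(p_prev)
    return [sum(p_prev[i] * trans[i][j] for i in range(n)) for j in range(n)]


if __name__ == "__main__":
    trans = [[0.9, 0.1],   # from "no impersonation"
             [0.2, 0.8]]   # from "impersonation"
    p = [1.0, 0.0]         # system known to be in "no impersonation"
    for _ in range(3):
        p = predict_state_probs(p, trans)
        print(p)
```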

ML Estimation of a Hidden Markov Sequence Using the Viterbi Algorithm

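The Viterbi algorithm is used for the ML sequence estimation (MLSE) of the hidden state sequence $\{x(k)\}$, i.e., it finds the most likely sequence of true states given the observed sequence $\{y(k)\}$ of hypothesis-test decisions. At this stage, we are done with impersonation detection mechanisms. Next, we discuss the transmitter identification mechanisms.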
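For readers who want to see the refinement step in action, here is a minimal log-domain Viterbi decoder for a two-state chain. It is a generic textbook sketch, not the authors' implementation; the transition, emission, and prior values are illustrative assumptions.

```python
import math


def safe_log(p):
    """Natural log that maps zero probability to -infinity."""
    return math.log(p) if p > 0.0 else float("-inf")


def viterbi(obs, trans, emit, prior):
    """Most likely hidden state sequence for a discrete HMM.

    obs:   list of observed state indices (hypothesis-test decisions)
    trans: trans[i][j] = P(next state j | current state i)
    emit:  emit[i][y]  = P(observe y | hidden state i)
    prior: prior[i]    = P(initial hidden state i)
    """
    n_states = len(prior)
    score = [safe_log(prior[s]) + safe_log(emit[s][obs[0]])
             for s in range(n_states)]
    backpointers = []
    for y in obs[1:]:
        new_score, pointers = [], []
        for j in range(n_states):
            best_i = max(range(n_states),
                         key=lambda i: score[i] + safe_log(trans[i][j]))
            pointers.append(best_i)
            new_score.append(score[best_i] + safe_log(trans[best_i][j])
                             + safe_log(emit[j][y]))
        score = new_score
        backpointers.append(pointers)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    trans = [[0.9, 0.1], [0.1, 0.9]]
    emit = [[0.8, 0.2], [0.2, 0.8]]
    print(viterbi([0, 0, 1, 0, 0], trans, emit, [0.5, 0.5]))
```

With "sticky" transitions, an isolated decision that disagrees with its neighbours is treated as a likely test error and smoothed out, which is the intuition behind the HMM's gain at low SNR.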

5. Transmitter Identification

The transmitter identification is accomplished via two approaches: ML- and GMM-based transmitter identification.

5.1. ML-Based Approach

In the ML-based approach, the probability of the misclassification error resulting from Equation (2) is given as: where . For the hypothesis test of (4), is given as: where , . Additionally, where the sort operation (.) sorts a vector in increasing order. For the boundary cases, e.g., , , , respectively.

5.2. Transmitter Identification Using Gaussian Mixture Modeling

The GMM consisted of Q component densities, of which only those of the legitimate (Alice) nodes could be trained, since no training data were assumed to be available for the Eve nodes. The GMM parameters were learned by running the expectation–maximization (EM) algorithm on the training data. The GMM, in its standard form, is well suited for transmitter identification. Under the GMM, the probability density function (pdf) of the (observed) mixture random variable X is the convex/weighted sum of the component pdfs:

$$ f_X(x) = \sum_{q=1}^{Q} \pi_q f_q(x), $$

where each $f_q(x) = \mathcal{N}(x; \mu_q, \sigma_q^2)$ is a Gaussian pdf that satisfies $f_q(x) \geq 0$, $\int f_q(x)\,dx = 1$. The weights/priors satisfy $\pi_q \geq 0$, $\sum_{q=1}^{Q} \pi_q = 1$. The GMM has 3Q unknown parameters (Q priors, Q means, and Q variances), which were learned by applying the iterative EM algorithm on the training data $\{x_n\}_{n=1}^{N_t}$. The posterior probability for each point in the training data (i.e., the likelihood of $x_n$ belonging to component q of the mixture) was computed as follows (j is the iteration number):

$$ \gamma_{n,q}^{(j)} = \frac{\pi_q^{(j)}\, \mathcal{N}\big(x_n; \mu_q^{(j)}, \sigma_q^{2\,(j)}\big)}{\sum_{l=1}^{Q} \pi_l^{(j)}\, \mathcal{N}\big(x_n; \mu_l^{(j)}, \sigma_l^{2\,(j)}\big)}. $$

The Q priors were updated as follows:

$$ \pi_q^{(j+1)} = \frac{1}{N_t} \sum_{n=1}^{N_t} \gamma_{n,q}^{(j)}. $$

The Q means were updated as follows:

$$ \mu_q^{(j+1)} = \frac{\sum_{n=1}^{N_t} \gamma_{n,q}^{(j)}\, x_n}{\sum_{n=1}^{N_t} \gamma_{n,q}^{(j)}}. $$

The Q (co-)variances were updated as follows:

$$ \sigma_q^{2\,(j+1)} = \frac{\sum_{n=1}^{N_t} \gamma_{n,q}^{(j)} \big(x_n - \mu_q^{(j+1)}\big)^2}{\sum_{n=1}^{N_t} \gamma_{n,q}^{(j)}}. $$

The iterative EM algorithm monotonically increased the objective (likelihood) function value and converged when the increase in the likelihood function value between two successive iterations became less than a pre-specified threshold.

Figure 3 shows a flow graph of the proposed methodology. The noisy estimated measurement/path loss at slot k was fed to a two-step mechanism for impersonation detection, and the HMM was used to refine the outcomes of the two-step mechanism with the help of the transition and emission probability matrices and the Viterbi algorithm. Transmitter identification was done via the ML and GMM approaches when no impersonation was decided.
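The EM iteration described above can be sketched for scalar (path loss) data as follows. This is the standard textbook EM for a one-dimensional Gaussian mixture, written with the standard library only; the initial values, the tolerance, and the small variance floor are illustrative assumptions rather than settings from the paper.

```python
import math


def gauss_pdf(x, mu, var):
    """Density of a scalar Gaussian with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, means, variances, weights, tol=1e-6, max_iter=200):
    """Fit a 1-D Gaussian mixture by EM; returns (means, variances, weights)."""
    means, variances, weights = list(means), list(variances), list(weights)
    n_comp = len(means)
    prev_ll = float("-inf")
    for _ in range(max_iter):
        # E-step: posterior probability of each component for each point.
        resp, ll = [], 0.0
        for x in data:
            joint = [weights[q] * gauss_pdf(x, means[q], variances[q])
                     for q in range(n_comp)]
            total = sum(joint)
            ll += math.log(total)
            resp.append([v / total for v in joint])
        # M-step: update priors, means, then variances (using the new means).
        for q in range(n_comp):
            r_q = sum(r[q] for r in resp)
            weights[q] = r_q / len(data)
            means[q] = sum(r[q] * x for r, x in zip(resp, data)) / r_q
            var_q = sum(r[q] * (x - means[q]) ** 2
                        for r, x in zip(resp, data)) / r_q
            variances[q] = max(var_q, 1e-9)  # floor avoids a zero variance
        # Stop when the log-likelihood gain falls below the threshold.
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return means, variances, weights


if __name__ == "__main__":
    offsets = (-1.0, -0.5, 0.0, 0.5, 1.0)
    data = [70.0 + d for d in offsets] * 4 + [80.0 + d for d in offsets] * 4
    print(em_gmm_1d(data, [68.0, 83.0], [4.0, 4.0], [0.5, 0.5]))
```

After training, a new measurement can be assigned to the component with the largest posterior probability, which is how a mixture model of this kind is typically used for identification.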

Figure 3

Proposed methodology for impersonation detection and transmitter identification in body-centric nanoscale communication systems operating in the THz band.

6. Simulations

6.1. Setup

We kept , , THz, k, and atm. Both the Alice and Eve nodes were deployed according to the uniform distribution in a 1 m × 1 m area. A total of random realizations (independent of the Alice and Eve nodes) of the nodes’ deployment were taken, and then, errors were averaged over the realizations. $P_{fa}$ and $P_{md}$ are the two well-known error probabilities that arise in hypothesis testing. $P_{fa}$ (false alarm) was defined as the probability that any i-th Alice node is mistaken for one of the Eve nodes, and $P_{md}$ (missed detection) is the probability that any j-th Eve node is mistaken for one of the Alice nodes.

6.2. Results

Figure 4 shows the two error probabilities against the SNR; both clearly improve as the SNR increases. Increasing the design parameter (the threshold $\epsilon$) decreased $P_{fa}$, but increased $P_{md}$.

Figure 4

The error probabilities against SNR for different values of the threshold $\epsilon$. (a) Probability of false alarm. (b) Probability of missed detection. Both probabilities are decreasing functions of SNR.

Figure 5 shows the efficacy of the HMM. At a low SNR, the performance of the HMM was far better than HT, and at a high SNR, HT was close to the HMM. The results were produced after the Monte Carlo-based simulation. The total number of transmissions was kept to (more specifically, binary states (, ) were generated), , , where I is the identity matrix and . The errors resulting from the HT and HMM methods were calculated as the number of times the predicted/estimated state was not equal to the actual state divided by the total transmissions. The accuracy was then computed accordingly. The entries of were calculated according to and . Figure 6 shows the receiver operating characteristic (ROC) curves for different configurations of the nodes and transmissions from Eve nodes (i.e., ). Typically, the ROC contains two error probabilities ( and ), but due to multiple nodes in this study, we had three probabilities. For any value, was constant, which is obvious from Equation (13). Increasing the SNR not only improved , but also improved as well. was chosen as an independent variable and swept in the range from zero to one. Using Equation (6), the threshold was calculated for a given SNR value. Further, (the detection probability) and were computed as the average after doing uniform realizations of the nodes’ deployment. We observed that increasing the number of nodes did not affect , but increased with an increase in the number of Alice nodes (M). We further observed that when fewer nodes (Alice nodes) remained idle during their allocated slot, the more we had.

Figure 5

Performance comparison of two-step hypothesis testing and the hidden Markov model (HMM) with Viterbi algorithm. (a) Histogram comparison for a highly degraded channel, i.e., SNR = dB. (b) Histogram comparison for a moderately degraded channel, i.e., SNR = 0 dB. Performances of both approaches get closer and closer when SNR increases.

Figure 6

Receiver operating characteristic (ROC) curves: Three probabilities (false alarm, detection, and misclassification) are considered in the ROC. To study the impact of nodes, subfigures (a,b) are plotted. (a) Ten numbers of legitimate and malicious nodes are considered with prior probability for a j-th malicious node. (b) Twenty numbers of legitimate and malicious nodes are considered with prior probability for a j-th malicious node. Further, subfigures (c,d) are plotted to see the impact of transmissions /prior probabilities of malicious nodes. (c) Ten numbers of legitimate and malicious nodes are considered with prior probability for a j-th malicious node. (d) Twenty numbers of legitimate and malicious nodes are considered with prior probability for a j-th malicious node.

The misclassification probability $P_{mc}$ is the probability of deciding that the i-th Alice node is some other Alice node. It becomes an important metric when dealing with the identification of multiple nodes. Here, the misclassification error resulted from both transmitter identification algorithms (ML, which is a by-product of the two-step HT-based authentication, and the GMM). As the GMM is a learning approach, it requires training data to learn its parameters. That is the reason that we used the GMM for transmitter identification only; we assumed that no data were available for the Eve nodes. Figure 7a was generated by assuming that the actual (noiseless) ground truths of the Alice nodes were available for performing ML-based transmitter identification. The ML was implemented using Equation (2) with noiseless ground truths. Figure 7a shows that the two approaches performed equally. To test the efficacy of the GMM approach, we performed another experiment and plotted the results in Figure 7b. This time, we assumed that the ground truths of the Alice nodes were noisy (i.e., the ground truths obtained on a secure channel also included noise or an error), and the ML-based approach was implemented using Equation (2) with these noisy ground truths. The GMM parameters were estimated on training data generated from the legitimate nodes and then tested. The error was calculated as the number of times the estimated state was not equal to the actual value divided by the total transmissions for both approaches and for both cases. We observed from Figure 7b that the GMM outperformed the ML-based approach overall. Its advantage grew further for a lower SNR or a higher estimation error.

Figure 7

Misclassification error against estimation error for two-step hypothesis testing and GMM. (a) A scenario where the perfect ground truth vector is obtained via a secure channel. (b) A scenario where the acquired ground truths are noisy. The GMM has an advantage over ML when the ground truths are noisy.

From Figure 4 and Figure 6, we learned that the path loss could be exploited as a fingerprint to carry out authentication in body-centric nanoscale communication systems operating in the THz band. In other words, the proposed mechanisms can be used as a first line of defense against impersonation attacks. The results of the proposed two-step mechanism can be improved by using an additional approach (i.e., the HMM). In particular, at a low SNR, the improvement was quite significant. The results in Figure 4 and Figure 6 indicate that, under the impersonation detection problem, it is not possible to minimize both the false alarm probability and the missed detection probability at the same time because of their conflicting nature. In other words, one error type can be reduced only at the expense of the other. The GMM (a learning-based scheme) performed the same as our proposed two-step mechanism in transmitter identification when the ground truths were noiseless. However, we learned that the slightly more complex GMM could produce improvements when the ground truths of the legitimate nodes are noisy.

7. Conclusions

This paper provided an authentication mechanism using path loss as a fingerprint at the physical layer in body-centric nanoscale communication systems operating in the terahertz band. The importance of the work was motivated by illustrating an envisioned smart healthcare application of body-centric nanoscale communication systems. The complex and quantum-insecure crypto measures can be complemented by this approach, which is simple and quantum secure (i.e., no encryption or shared secret key is involved). For the simulations, nodes were deployed in a 1 m × 1 m area under a uniform distribution, air was considered as the medium among the nodes, and the path loss was calculated using the HITRAN database. The ROC curves obtained from the Monte Carlo simulation of the nodes’ deployment showed that, with a false alarm rate of 0.2, the detection probability was almost one when operating at an SNR of 10 dB.

1 in total

1. Security and eavesdropping in terahertz wireless links.

Authors: Jianjun Ma; Rabi Shrestha; Jacob Adelberg; Chia-Yi Yeh; Zahed Hossain; Edward Knightly; Josep Miquel Jornet; Daniel M Mittleman
Journal: Nature Date: 2018-10-15 Impact factor: 49.962
