A modular spectrum sensing system based on PSO-SVM.

Zhuoran Cai¹, Honglin Zhao, Zhutian Yang, Yun Mo.

Abstract

In cognitive radio systems, spectrum sensing for detecting the presence of primary users in a licensed spectrum is a fundamental problem. Energy detection is the most popular spectrum sensing scheme used to differentiate the case where the primary user's signal is present from the case where there is only noise. Spectrum sensing can in fact be viewed as a binary classification problem, and energy detection is a linear classifier. If the signal-to-noise ratio (SNR) of the received signal is low and the number of received signal samples available for sensing is small, the binary classification problem is linearly inseparable. In this situation the performance of energy detection degrades seriously. In this paper, a novel approach is proposed that obtains a nonlinear threshold, based on a support vector machine with particle swarm optimization (PSO-SVM), to replace the linear threshold used in traditional energy detection. Simulations demonstrate that the performance of the proposed algorithm is much better than that of traditional energy detection.

Year: 2012 PMID: 23202211 PMCID: PMC3522964 DOI: 10.3390/s121115292

Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576

Introduction

Under the conventional fixed spectrum allocation policy, most of the available radio spectrum has been assigned to registered users, which leads to serious under-utilization of the spectrum. Recent reports from the Federal Communications Commission (FCC) have shown that only 30% of the allocated spectrum in the US is fully utilized [1]. Cognitive radio, which enables secondary users to use the spectrum when primary users are not occupying it, has been proposed as a promising technology to improve spectrum utilization efficiency [2-4]. It has three essential components: (1) Spectrum sensing: the secondary users sense the radio spectrum environment within their operating range to detect the frequency bands that are not occupied by primary users; (2) Dynamic spectrum management: cognitive radio networks dynamically select the best available bands for communication; (3) Adaptive communications: a cognitive radio device can configure its transmission parameters (e.g., carrier frequency, transmission power) to opportunistically make best use of the ever-changing available spectrum [5].

Spectrum sensing is a fundamental task for cognitive radio. However, several factors make spectrum sensing challenging in practice (e.g., low signal-to-noise ratio (SNR) of primary users, noise uncertainty, multipath fading). Several sensing methods have been proposed, including the likelihood ratio test (LRT) [6-8], the energy detection method [9,10], the matched filtering (MF) method [11], the cyclostationary detection method [12,13] and the statistical covariance-based method [14]. Each has its own advantages and disadvantages. The LRT is proven to be optimal, but it requires exact channel information and the distributions of the primary signal and noise. The MF-based method needs perfect knowledge of the channel responses from the primary users to the receiver and accurate synchronization (otherwise, its performance is dramatically reduced) [15]; this may not be possible if the primary users do not cooperate with the secondary users. The cyclostationary detection method requires the cyclic frequencies of the primary users, which may not be realistic for many spectrum reuse applications, and it also demands high computational capability. The energy detection method does not require any primary signal information and is robust to unknown dispersive channels and fading, but if the SNR of the received signal is low, the number of received signal samples is small, and the noise power is estimated inaccurately, its performance degrades seriously [11]. The covariance-based method also does not require any prior information, but its computational complexity is high as well [14].

As mentioned, one drawback of traditional energy detection is that its performance may degrade seriously if the SNR of the received signal is low and the number of received signal samples is small. To overcome this drawback, this paper proposes a novel method that obtains a nonlinear threshold for energy detection based on PSO-SVM. The proposed method focuses on a single-point, single-antenna scenario and is divided into an Offline module and an Online module. In the Offline module, the proposed system generates two classes of training samples, one for the simulated situation in which signal and noise are both present and the other for noise only. The normalized energy of these two classes of training samples is used as the classification feature to train the PSO-SVM. After each training step, a decision function is generated. In the Online module, the decision functions obtained in the Offline module are used as the nonlinear thresholds for energy detection to decide whether the primary user is present. The experimental results show that the receiver operating characteristic (ROC) curve of the proposed approach is much better than that of traditional energy detection.

The rest of this paper is organized as follows: the PSO-SVM is introduced in Section 2. In Section 3 the determination of the threshold and the theoretical analysis are presented. Simulation results are given in Section 4. Conclusions are finally drawn in Section 5.

PSO-SVM

SVM

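In this subsection, a brief introduction to the SVM, proposed by Vapnik [16], is given. Let $(x_i, y_i)_{1 \le i \le N}$ be a set of N training samples, where each sample $x_i \in \mathbb{R}^d$ is a vector, d being the dimension of the input space, belonging to a class labeled by $y_i \in \{1, -1\}$. Training amounts to finding a weight vector w and a scalar b that satisfy:

$$ y_i\left(\langle w \cdot x_i \rangle + b\right) \ge 1, \quad i = 1, 2, \ldots, N \tag{1} $$

where $\langle \cdot \rangle$ denotes the inner product. The aim of the SVM is to find the hyper-plane that places the samples with the same label on the same side of the hyper-plane. The quantity $\gamma = 1/\lVert w \rVert_2$ is the margin, and the optimal separating hyper-plane (OSH) is the separating hyper-plane that maximizes the margin. The larger the margin, the better the expected generalization [17]. To find the maximum γ, quadratic programming is usually used, leading to:

$$ \min_{w,b} \ \frac{1}{2}\langle w \cdot w \rangle \quad \text{subject to} \quad y_i\left(\langle w \cdot x_i \rangle + b\right) \ge 1, \quad i = 1, 2, \ldots, N \tag{2} $$

According to Equation (2), a hyper-plane $\langle w \cdot x \rangle + b = 0$ with the largest margin can be obtained. The inequality constraints in Equation (2) can be absorbed into the objective by introducing positive Lagrange multipliers $\alpha_i$, leading to:

$$ L(w, b, \alpha) = \frac{1}{2}\langle w \cdot w \rangle - \sum_{i=1}^{N} \alpha_i \left[ y_i\left(\langle w \cdot x_i \rangle + b\right) - 1 \right] \tag{3} $$

To minimize $L(w, b, \alpha)$, the gradient of $L(w, b, \alpha)$ with respect to w and b is required to vanish, which gives:

$$ w = \sum_{i=1}^{N} \alpha_i y_i x_i, \qquad \sum_{i=1}^{N} \alpha_i y_i = 0 \tag{4} $$

Substituting Equation (4) into Equation (3) gives:

$$ L(w, b, \alpha) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j \langle x_i \cdot x_j \rangle \tag{5} $$

According to Equation (3) and Equation (5), minimizing L over w and b is equivalent to maximizing Equation (5) over α, so the quadratic programming problem in Equation (3) can be converted to:

$$ \max_{\alpha} \ W(\alpha) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j \langle x_i \cdot x_j \rangle \tag{6} $$

subject to:

$$ \sum_{i=1}^{N} \alpha_i y_i = 0, \qquad \alpha_i \ge 0, \qquad \alpha_i \left[ y_i\left(\langle w \cdot x_i \rangle + b\right) - 1 \right] = 0, \quad i = 1, 2, \ldots, N \tag{7} $$

The third constraint in Equation (7) is the Karush-Kuhn-Tucker (KKT) condition [18]. There is a Lagrange multiplier $\alpha_i$ for each training sample. The training samples for which $\alpha_i > 0$ are called "support vectors", and they lie on one of the two hyper-planes $\langle w \cdot x_+ \rangle + b = +1$ and $\langle w \cdot x_- \rangle + b = -1$. The training samples only appear in the form of inner products between vectors.

When the data are noisy and not linearly separable, a soft margin is used. We introduce the slack variables $\xi_i \ge 0$, i = 1, 2, …, N, so that:

$$ y_i\left(\langle w \cdot x_i \rangle + b\right) \ge 1 - \xi_i, \quad i = 1, 2, \ldots, N \tag{8} $$

The generalized OSH is the solution of minimizing:

$$ \frac{1}{2}\langle w \cdot w \rangle + C \sum_{i=1}^{N} \xi_i \tag{9} $$

subject to the constraints in Equation (8). The sum $\sum_{i} \xi_i$ is an upper bound on the number of training errors, and C is the penalty parameter that controls the errors. In the nonlinear SVM, a kernel function is introduced to map the initial data into a feature space of high dimension, in which the data should be linearly separable. Equation (6) is then converted to:

$$ \max_{\alpha} \ W(\alpha) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \tag{10} $$

subject to Equation (7) and $0 \le \alpha_i \le C$, where $K(x_i, x_j)$ is the kernel function. As one of the most popular kernel functions, the RBF kernel is considered in this paper, and it takes the following form:

$$ K(x_i, x_j) = \exp\left(-g \lVert x_i - x_j \rVert^2\right) $$

where g is the kernel parameter, which controls the width of the kernel function. If g is too large, the SVM may overfit the training data, and if g is too small, the SVM may not be flexible enough for complex function approximation. By solving Equation (10) we can obtain the minimum $\langle w \cdot w \rangle$:

$$ \langle w^* \cdot w^* \rangle = \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i^* \alpha_j^* y_i y_j K(x_i, x_j) $$

where $w^*$ and $\alpha^*$ are the solution of Equation (10). The decision function is then:

$$ f(x) = \operatorname{sign}\left(\langle w^* \cdot x \rangle + b^*\right) = \operatorname{sign}\left(\sum_{i=1}^{N} \alpha_i^* y_i K(x_i, x) + b^*\right) $$

where x is a test sample with an unknown label y, and the inner product is taken in the feature space. Note that $w^*$ depends only on the training samples $(x_i, y_i)_{1 \le i \le N}$: once a set of training samples is chosen, a unique decision function is obtained by solving Equation (10).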
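As an illustration of how a trained RBF-kernel SVM labels a test sample, the following minimal Python sketch evaluates a decision function of the form sign(Σ α_i y_i K(x_i, x) + b). It assumes the common kernel form K(x, y) = exp(−g‖x − y‖²); the support vectors, multipliers and bias are hand-picked hypothetical values, not the result of any training run in the paper.

```python
import math

def rbf_kernel(x, y, g):
    """RBF kernel K(x, y) = exp(-g * ||x - y||^2) (assumed form)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-g * sq_dist)

def decision(x, support_vectors, labels, alphas, b, g):
    """Return +1 or -1 from sign(sum_i alpha_i * y_i * K(x_i, x) + b)."""
    score = sum(a * y * rbf_kernel(sv, x, g)
                for sv, y, a in zip(support_vectors, labels, alphas))
    return 1 if score + b >= 0 else -1

# Hypothetical values for illustration only (not trained):
# two support vectors on a one-dimensional feature.
svs = [[0.5], [2.0]]
ys = [-1, 1]
alphas = [1.0, 1.0]

print(decision([0.6], svs, ys, alphas, b=0.0, g=1.0))  # close to the -1 vector
print(decision([1.9], svs, ys, alphas, b=0.0, g=1.0))  # close to the +1 vector
```

Only the support vectors enter the sum, which is why evaluating the decision function on a new sample is cheap compared with training.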

Particle Swarm Optimization

The parameters C and g need to be set in order to solve Equation (10). However, in a practical situation it is difficult to set these two parameters properly. In this subsection, we integrate a particle swarm optimization (PSO) method to set C and g adaptively. PSO is inspired by the social behavior of birds: birds in a swarm hunt for food and cooperate with each other to search for the optimal position at which to obtain it. In a PSO problem there is a group of individuals, each called a "particle", and each particle is a potential solution. Suppose that the solution space of the optimization problem has D dimensions. The i-th particle is $X_i = (x_{i1}, x_{i2}, x_{i3}, \ldots, x_{iD})$, its own best position is $P_i = (p_{i1}, p_{i2}, p_{i3}, \ldots, p_{iD})$, its velocity is $V_i = (v_{i1}, v_{i2}, v_{i3}, \ldots, v_{iD})$, and the best position of the swarm is $P_g = (p_{g1}, p_{g2}, p_{g3}, \ldots, p_{gD})$. The iteration to find the optimal position is given by:

$$ v_{id}^{t+1} = \varpi v_{id}^{t} + c_1 r_1 \left(p_{id} - x_{id}^{t}\right) + c_2 r_2 \left(p_{gd} - x_{id}^{t}\right) \tag{12} $$

$$ x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1} $$

where $x_{id}^{t}$ and $v_{id}^{t}$ are the d-th location and velocity components of the i-th particle at the t-th iteration, $c_1$ and $c_2$ are two positive coefficients, $r_1$ and $r_2$ are random numbers drawn from 0 to 1, and ϖ is the flexible (inertia) coefficient of $v_{id}^{t}$. The second part of Equation (12) is called the "cognitive" part, which is the optimization of the particle itself, and the third part of Equation (12) is the "swarm" part, which is the cooperation with other particles. In addition, bounds should be set for each particle: $[-x_{max}, x_{max}]$ for the position and $v_{max}$ for the velocity. During the iteration, if a component of a particle goes out of range, it is replaced by the boundary value.

For the optimization of the error penalty parameter C and the kernel parameter g, each particle is set as (C, g), and three-fold cross-validation is used to estimate the performance of each (C, g): the training data are separated into three parts, one part is used as the validation set and the remaining two parts are used for training. Our earlier work has validated this PSO algorithm [19].
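The velocity and position updates, together with the boundary clamping, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the inertia and acceleration values, swarm size, and search bounds are illustrative choices, and the objective `toy_cost` is a hypothetical stand-in for the cross-validation error of an SVM trained with a given (C, g), which the paper uses but which is not reproduced here.

```python
import random

def pso_minimize(f, bounds, n_particles=30, iters=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize f over a box using the basic PSO velocity/position updates."""
    rng = random.Random(seed)
    dim = len(bounds)
    v_max = [0.5 * (hi - lo) for lo, hi in bounds]  # assumed velocity limit
    xs = [[rng.uniform(lo, hi) for lo, hi in bounds]
          for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    p_best = [x[:] for x in xs]          # best position of each particle
    p_val = [f(x) for x in xs]
    g_idx = min(range(n_particles), key=p_val.__getitem__)
    g_best, g_val = p_best[g_idx][:], p_val[g_idx]  # best swarm position
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vs[i][d]                            # inertia part
                     + c1 * r1 * (p_best[i][d] - xs[i][d])   # cognitive part
                     + c2 * r2 * (g_best[d] - xs[i][d]))     # swarm part
                v = max(-v_max[d], min(v_max[d], v))         # clamp velocity
                lo, hi = bounds[d]
                x = max(lo, min(hi, xs[i][d] + v))           # clamp position
                vs[i][d], xs[i][d] = v, x
            val = f(xs[i])
            if val < p_val[i]:
                p_best[i], p_val[i] = xs[i][:], val
                if val < g_val:
                    g_best, g_val = xs[i][:], val
    return g_best, g_val

def toy_cost(particle):
    """Hypothetical stand-in for a cross-validation error over (C, g)."""
    c, g = particle
    return ((c - 10.0) / 10.0) ** 2 + (g - 0.5) ** 2

best, best_val = pso_minimize(toy_cost, bounds=[(0.1, 100.0), (0.01, 10.0)])
print(best, best_val)
```

In a setting like the paper's, `toy_cost` would be replaced by a function that trains an SVM with the particle's (C, g) on two folds and returns the error on the third, averaged over the three folds.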

Threshold Determination and Theoretical Analysis

Common notation as summarized in Table 1 is used throughout this section.

Table 1.

Notation

N_s	number of signal samples for spectrum sensing
s(n)	actual received primary signal sample
u(n)	actual received noise sample
s_tr(n)	training primary signal sample in offline module
u_tr(n)	training noise sample in offline module
σ_s^2	variance of actual received primary signal
σ_u^2	variance of actual noise
σ_{s,tr}^2 = λ_tr	variance of training primary signal in offline module
σ_{u,tr}^2 = 1	variance of training noise in offline module
λ	SNR of actual received primary signal
λ_tr	SNR of training primary signal in offline module
T	actual decision statistic
T_1	actual decision statistic at hypothesis H_1
T_0	actual decision statistic at hypothesis H_0
T' = T/σ_u^2	normalized actual decision statistic
T'_1 = T_1/σ_u^2	normalized actual decision statistic at hypothesis H_1
T'_0 = T_0/σ_u^2	normalized actual decision statistic at hypothesis H_0
T_{1tr}^{N_s,λ_tr}	training decision statistic defined by N_s, λ_tr at hypothesis H_1 in offline module
T_{0tr}^{N_s}	training decision statistic defined by N_s at hypothesis H_0 in offline module

Basic Conception of Energy Detection

In this subsection we introduce the general model for spectrum sensing, review the energy detection scheme, and analyze the relationship between the probability of false alarm and the probability of detection. Suppose that we are interested in the frequency band with carrier frequency $f_c$ and bandwidth W, and that the received signal is sampled at sampling frequency $f_s$. When the primary user is active, the discrete received signal at the secondary user is given by [5]:

$$ z(n) = s(n) + u(n) $$

which is referred to as hypothesis $H_1$. When the primary user is inactive, the received signal is given by:

$$ z(n) = u(n) $$

and this case is referred to as hypothesis $H_0$. In this paper we focus only on the single-point, single-antenna spectrum sensing scenario, so for simplicity we make the following assumptions [5]: (AS1) The primary signal s(n) is an independent and identically distributed (IID) random process with mean zero and unknown variance $\sigma_s^2$; (AS2) The noise u(n) is a Gaussian IID random process with mean zero and variance $\sigma_u^2$, which can be estimated; (AS3) The primary signal s(n) is independent of the noise u(n). The signal-to-noise ratio (SNR) of the actual primary user measured at the secondary receiver of interest is $\lambda = \sigma_s^2 / \sigma_u^2$ under hypothesis $H_1$. We consider circularly symmetric complex Gaussian (CSCG) noise and, for the primary signal s(n), a complex PSK modulated signal.

Two probabilities are of interest for spectrum sensing: the probability of detection $P_d$, which is the probability, under hypothesis $H_1$, that the sensing method correctly detects the presence of the primary user; and the probability of false alarm $P_f$, which is the probability, under hypothesis $H_0$, that the sensing method claims the presence of the primary user. Energy detection is one of the most popular spectrum sensing schemes because it does not need any prior information about the primary signal and is easy to apply. Let τ be the available sensing time and $N_s$ be the number of samples; for simplicity we assume $N_s = \tau f_s$. The test statistics are given by [5]:

$$ T_1 = \frac{1}{N_s} \sum_{n=1}^{N_s} \lvert s(n) + u(n) \rvert^2, \qquad T_0 = \frac{1}{N_s} \sum_{n=1}^{N_s} \lvert u(n) \rvert^2 $$

For generality, the decision statistics need to be normalized by the estimated noise power, and the normalized decision statistics are given by:

$$ T'_1 = \frac{T_1}{\sigma_u^2}, \qquad T'_0 = \frac{T_0}{\sigma_u^2} $$

If the number of signal samples $N_s$ is small, the probability of false alarm $P_f$ for a predefined threshold ε is given by:

$$ P_f = P\left(T'_0 > \varepsilon \mid H_0\right) \tag{20} $$

Under $H_0$, the normalized statistic is, up to scaling, a chi-square variable with $2N_s$ degrees of freedom in the complex-valued case. From Equation (20) the threshold ε is related to $P_f$ as $\varepsilon = \mathrm{chi2}^{-1}(P_f, 2N_s)$, where $\mathrm{chi2}^{-1}(\cdot)$ is the inverse of the chi-square cumulative distribution function. For the same threshold ε, the probability of detection $P_d$ is given by:

$$ P_d = P\left(T'_1 > \varepsilon \mid H_1\right) \tag{21} $$

Under $H_1$, the normalized statistic is, up to scaling, a non-central chi-square variable with $2N_s$ degrees of freedom and a non-centrality parameter λ. Extensive tables exist for the chi-square distribution, but the non-central chi-square has not been as extensively tabulated. In this paper, we use the approximation proposed by Patnaik [20] to replace the non-central chi-square with a central chi-square having a different number of degrees of freedom and a modified threshold level. If the non-central chi-square variable has $2N_s$ degrees of freedom and non-centrality parameter λ, define a modified number of degrees of freedom D and a threshold divisor G given by:

$$ D = \frac{(2N_s + \lambda)^2}{2N_s + 2\lambda}, \qquad G = \frac{2N_s + 2\lambda}{2N_s + \lambda} $$

then:

$$ P\left(\chi'^2_{2N_s}(\lambda) > \varepsilon\right) \approx P\left(\chi^2_{D} > \varepsilon / G\right) $$

As described above, once the probability of false alarm $P_f$ and the number of samples $N_s$ are set, a unique value of the threshold ε is obtained, which is equivalent to a linear classifier in a binary classification problem. But if the SNR λ is low and the number of signal samples $N_s$ is small, the corresponding spectrum sensing problem is linearly inseparable, and traditional energy detection cannot classify this linearly inseparable problem efficiently.
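A short Monte Carlo sketch can make the energy detector concrete: it computes the normalized decision statistic (average received energy divided by the noise variance) under both hypotheses and counts how often it exceeds a fixed threshold. The sketch assumes QPSK as the PSK signal, a known noise variance, and an arbitrary illustrative threshold of 1.5; none of these specific values are taken from the paper.

```python
import cmath
import math
import random

def cscg(rng, var):
    """One circularly symmetric complex Gaussian sample with variance var."""
    s = math.sqrt(var / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def qpsk(rng, power):
    """One QPSK symbol with the given power (QPSK is an assumption here)."""
    phase = math.pi / 4 + rng.randrange(4) * math.pi / 2
    return math.sqrt(power) * cmath.exp(1j * phase)

def normalized_energy(samples, noise_var):
    """Average energy of the samples divided by the noise variance."""
    return sum(abs(z) ** 2 for z in samples) / (len(samples) * noise_var)

def simulate(n_s, snr_db, threshold, trials=5000, noise_var=1.0, seed=1):
    """Estimate P_f, P_d and the mean statistic under H0 and H1."""
    rng = random.Random(seed)
    sig_pow = noise_var * 10 ** (snr_db / 10.0)
    false_alarms = detections = 0
    sum_t0 = sum_t1 = 0.0
    for _ in range(trials):
        h0 = [cscg(rng, noise_var) for _ in range(n_s)]
        h1 = [qpsk(rng, sig_pow) + cscg(rng, noise_var) for _ in range(n_s)]
        t0 = normalized_energy(h0, noise_var)   # noise only
        t1 = normalized_energy(h1, noise_var)   # signal plus noise
        false_alarms += t0 > threshold
        detections += t1 > threshold
        sum_t0 += t0
        sum_t1 += t1
    return (false_alarms / trials, detections / trials,
            sum_t0 / trials, sum_t1 / trials)

p_f, p_d, mean_t0, mean_t1 = simulate(n_s=5, snr_db=0.0, threshold=1.5)
print(p_f, p_d, mean_t0, mean_t1)
```

With unit-mean normalization, the noise-only statistic averages about 1 and the signal-plus-noise statistic about 1 + λ, so at low SNR and small sample counts the two distributions overlap heavily, which is the regime the section describes as hard for a single fixed threshold.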

Nonlinear Threshold System

To overcome the drawbacks of traditional energy detection mentioned above, we propose a method for obtaining a nonlinear threshold based on PSO-SVM. The system comprises two distinct modules, Offline and Online, which are illustrated in Figure 1.

Figure 1.

System model of the proposed method.

Offline Module

The main function of the Offline module is to generate the nonlinear thresholds for energy detection. First, the proposed system generates the training signal and training noise under AS1-3. The variance of the training signal is therefore a known value $\sigma_{s,tr}^2 = \lambda_{tr}$, the variance of the training noise is $\sigma_{u,tr}^2 = 1$, and the training SNR is $\lambda_{tr}$. Second, based on the parameter $\lambda_{tr}$ and the number of signal samples $N_s$, two classes of training decision statistics, under hypothesis $H_1$ and hypothesis $H_0$, are obtained:

$$ T_{1tr}^{N_s,\lambda_{tr}} = \frac{1}{N_s} \sum_{n=1}^{N_s} \lvert s_{tr}(n) + u_{tr}(n) \rvert^2 \tag{24} $$

$$ T_{0tr}^{N_s} = \frac{1}{N_s} \sum_{n=1}^{N_s} \lvert u_{tr}(n) \rvert^2 \tag{25} $$

where $T_{1tr}^{N_s,\lambda_{tr}}$ denotes a class of non-central chi-square variables with mean $1 + \lambda_{tr}$, $2N_s$ degrees of freedom and a non-centrality parameter λ, while $T_{0tr}^{N_s}$ denotes a class of central chi-square variables with mean 1 and $2N_s$ degrees of freedom. Third, each variable of class $T_{1tr}^{N_s,\lambda_{tr}}$ is labeled "+1" and each variable of class $T_{0tr}^{N_s}$ is labeled "−1", and these two classes of variables are used as training data to train the PSO-SVM described in Section 2. Consequently, a separating hyper-plane $\langle w^* \cdot x \rangle + b = 0$ and a decision function $f(x) = \operatorname{sign}(\langle w^* \cdot x \rangle + b)$ are derived. In the fourth step, the variables of class $T_{0tr}^{N_s}$ are applied to test this decision function so as to obtain the probability (denoted $P_e^{N_s,\lambda_{tr}}$) that the decision function mistakenly labels a $T_{0tr}^{N_s}$ variable as a $T_{1tr}^{N_s,\lambda_{tr}}$ variable.

Proposition 1

The probability that a variable is mislabeled by the decision function is determined by the geometric distance γ from this variable to the separating hyper-plane.

Proof

The proof is mainly based on the Rosenblatt classifier, and the detailed proof is given in Appendix A. The average geometric distance from the variables of class $T_{0tr}^{N_s}$ to the separating hyper-plane is given by Equation (26), while the average geometric distance from the variables of class $T'_0$ to the separating hyper-plane is given by Equation (27). According to Equation (26) and Equation (27), the average geometric distance from the variables of class $T_{0tr}^{N_s}$ to the separating hyper-plane is equal to the average geometric distance from the variables of class $T'_0$ to the separating hyper-plane, and the probability distribution of class $T_{0tr}^{N_s}$ is the same as the probability distribution of class $T'_0$. Then, based on Proposition 1, the probability that a variable of class $T_{0tr}^{N_s}$ is mistakenly labeled as class $T_{1tr}^{N_s,\lambda_{tr}}$ is equal to the probability (the probability of false alarm) that a variable of class $T'_0$ is mistakenly labeled as class $T'_1$. In short:

$$ P_e^{N_s,\lambda_{tr}} = P_f $$

Each pair of parameters $\lambda_{tr}$ and $N_s$ is used to generate two classes of variables: $T_{1tr}^{N_s,\lambda_{tr}}$ and $T_{0tr}^{N_s}$. A decision function, denoted $f_{N_s,P_f}(x)$, is obtained by training the PSO-SVM with these two classes of data. Finally, the decision function $f_{N_s,P_f}(x)$ is stored as a non-linear threshold. The process to obtain a non-linear threshold is shown in Table 2, and some typical training results of $f_{N_s,P_f}(x)$ are shown in Table 3 and Figure 2.

Table 2.

The process to obtain a non-linear threshold.

1. Generate training signal s_tr(n) and training noise u_tr(n), with σ_{s,tr}^2 = λ_tr and σ_{u,tr}^2 = 1.

2. Compute two classes of data, T_{1tr}^{N_s,λ_tr} and T_{0tr}^{N_s}, by (24) and (25).

3. Train PSO-SVM with the two classes of data, T_{1tr}^{N_s,λ_tr} and T_{0tr}^{N_s}, to obtain a decision function f(x).

4. Test this f(x) with the variables of the T_{0tr}^{N_s} class to obtain P_e^{N_s,λ_tr}; based on Proposition 1, P_e^{N_s,λ_tr} = P_f.

5. Return f(x) as f_{N_s,P_f}, and store it as a non-linear threshold.
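Steps 1, 2 and 4 of this process can be sketched in Python as follows. The sketch generates the two labeled classes of normalized training energies with unit noise variance and signal variance equal to the training SNR, then estimates the error rate on the noise-only class for a given decision function. Step 3 (training the PSO-SVM) is not reproduced: `placeholder_decision` is a hypothetical fixed-threshold rule standing in for a trained decision function, and QPSK is assumed as the PSK signal.

```python
import cmath
import math
import random

def training_statistics(n_s, snr_tr_db, n_vars=500, seed=0):
    """Return labeled data [([statistic], label), ...] for both classes."""
    rng = random.Random(seed)
    lam_tr = 10 ** (snr_tr_db / 10.0)   # training signal variance = lambda_tr

    def noise():
        # Unit-variance circularly symmetric complex Gaussian noise.
        s = math.sqrt(0.5)
        return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

    def psk():
        # QPSK symbol with power lambda_tr (modulation order is assumed).
        phase = math.pi / 4 + rng.randrange(4) * math.pi / 2
        return math.sqrt(lam_tr) * cmath.exp(1j * phase)

    data = []
    for _ in range(n_vars):
        t1 = sum(abs(psk() + noise()) ** 2 for _ in range(n_s)) / n_s
        t0 = sum(abs(noise()) ** 2 for _ in range(n_s)) / n_s
        data.append(([t1], +1))   # signal plus noise, labeled +1
        data.append(([t0], -1))   # noise only, labeled -1
    return data

def empirical_pe(decision_fn, data):
    """Fraction of noise-only variables that decision_fn labels as +1."""
    noise_only = [x for x, y in data if y == -1]
    wrong = sum(1 for x in noise_only if decision_fn(x) == +1)
    return wrong / len(noise_only)

def placeholder_decision(x):
    """Hypothetical stand-in for a trained PSO-SVM decision function."""
    return +1 if x[0] > 1.5 else -1

data = training_statistics(n_s=5, snr_tr_db=2.3)
p_e = empirical_pe(placeholder_decision, data)
print(len(data), p_e)
```

The default of 500 variables per class mirrors the number quoted for the simulations in the comparison section; the estimated error rate is what would be recorded as the false-alarm probability attached to the stored decision function.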

Table 3.

Typical Training Results.

N_s = 5			N_s = 10
λ_tr	P_f	f_{N_s, P_f}(x)	λ_tr	P_f	f_{N_s, P_f}(x)
−35.7 dB	0.9	f_5,0.9(x)	−37.1 dB	0.9	f_10,0.9(x)
−29.8 dB	0.8	f_5,0.8(x)	−32.8 dB	0.8	f_10,0.8(x)
−24.9 dB	0.7	f_5,0.7(x)	−27.2 dB	0.7	f_10,0.7(x)
−20.4 dB	0.6	f_5,0.6(x)	−21.6 dB	0.6	f_10,0.6(x)
−16.5 dB	0.5	f_5,0.5(x)	−18.4 dB	0.5	f_10,0.5(x)
−6.5 dB	0.4	f_5,0.4(x)	−8.1 dB	0.4	f_10,0.4(x)
−2.8 dB	0.3	f_5,0.3(x)	−4.2 dB	0.3	f_10,0.3(x)
−0.2 dB	0.2	f_5,0.2(x)	−1.7 dB	0.2	f_10,0.2(x)
2.3 dB	0.1	f_5,0.1(x)	−0.2 dB	0.1	f_10,0.1(x)

Figure 2.

Training Results of Table 3.

Online Module

In the Online module, the proposed system automatically chooses one of the decision functions stored by the Offline module (according to the required number of signal samples $N_s$ and probability of false alarm $P_f$) as the non-linear threshold to judge whether the actual primary user is present. For example, if the required number of signal samples is $N_s = 5$ and the probability of false alarm is $P_f = 0.1$, the proposed system applies the decision function $f_{5,0.1}(x)$ as the non-linear threshold. If a decision function $f_{N_s,P_f}(x)$ is chosen as the nonlinear threshold, Equations (20) and (21) are converted to:

$$ P_f = P\left(f_{N_s,P_f}(T'_0) = +1\right), \qquad P_d = P\left(f_{N_s,P_f}(T'_1) = +1\right) $$

The result of spectrum sensing is given by:

$$ f_{N_s,P_f}(T') = +1 \ \Rightarrow \ H_1 \ \text{(primary user present)}, \qquad f_{N_s,P_f}(T') = -1 \ \Rightarrow \ H_0 \ \text{(primary user absent)} $$
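The selection step can be pictured as a lookup keyed by the number of samples and the target false-alarm probability, followed by one evaluation of the chosen decision function on the normalized energy. The Python sketch below assumes a hypothetical in-memory store; the stored functions are simple placeholder rules with made-up cut-off values, standing in for decision functions produced offline.

```python
def make_placeholder(cutoff):
    """Hypothetical stand-in for a stored decision function f_{N_s,P_f}."""
    return lambda t_norm: +1 if t_norm > cutoff else -1

# Hypothetical store: keys are (N_s, P_f); the cut-off values are made up.
threshold_store = {
    (5, 0.1): make_placeholder(1.85),
    (5, 0.2): make_placeholder(1.60),
}

def sense(samples, noise_var, p_f, store):
    """Return True if the primary user is declared present."""
    n_s = len(samples)
    decision_fn = store[(n_s, p_f)]   # KeyError if no function was stored
    # Normalized energy: N_s multiplications and additions, one division.
    t_norm = sum(abs(z) ** 2 for z in samples) / (n_s * noise_var)
    return decision_fn(t_norm) == +1

strong = [2 + 0j] * 5      # average normalized energy 4.0
weak = [0.1 + 0j] * 5      # average normalized energy 0.01
print(sense(strong, 1.0, 0.1, threshold_store))
print(sense(weak, 1.0, 0.1, threshold_store))
```

Because all training happens beforehand, the online cost is the energy computation plus a single function evaluation, which matches the section's point that the online complexity stays close to that of plain energy detection.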

Comparison with Traditional Energy Detection

Energy detection is the basic sensing method, which was first proposed in [9] and further studied in [5,10]. It does not need any information about the signal to be detected and is robust to unknown dispersive channels. Energy detection compares the normalized average power of the actual received signal plus noise with the noise power to make a decision. To guarantee reliable detection, the threshold must be set according to the actual noise power and the number of samples $N_s$ [9]. The difference between traditional energy detection and the proposed system is that the proposed system has an Offline module that obtains decision functions as the non-linear thresholds. In the Offline module, the system needs a large number of $T_{1tr}^{N_s,\lambda_{tr}}$ variables and $T_{0tr}^{N_s}$ variables to train the PSO-SVM during each training process. In the simulations in this article, for instance, 500 variables from $T_{1tr}^{N_s,\lambda_{tr}}$ and 500 from $T_{0tr}^{N_s}$ are used for each training process, whose computational complexity is about O(1000^3) [16]. Obtaining the full list of decision functions requires many such training runs, so the overall computational complexity of the Offline module is extremely high. However, in a real spectrum sensing situation only the computational complexity of the Online module matters. Traditional energy detection needs about $N_s$ multiplications and additions, and the proposed method needs about $N_s + 1$ multiplications and additions, which is competitive with traditional energy detection.

Results and Discussion

In this section, we use the decision functions stored by the Offline module as the nonlinear thresholds to simulate the probability of detection $P_d$ in the Online module. In Figure 3 we compare the receiver operating characteristic curves of the proposed method and traditional energy detection for actual SNR λ = 0 dB and number of signal samples $N_s$ = 5. If the SNR of the actual received signal is low and the number of signal samples is small, e.g., λ = 0 dB and $N_s$ = 5, the corresponding spectrum sensing problem based on energy detection is a linearly inseparable binary classification problem. Traditional energy detection with a predefined threshold is a linear classifier and cannot solve a linearly inseparable problem efficiently, whereas the proposed method is a nonlinear classifier based on PSO-SVM, which can. As shown in Figure 3, the performance of the proposed method is much better than that of traditional energy detection.

Figure 3.

Receiver operating characteristic curve of the proposed method and traditional energy detection for number of signal samples N_s = 5 and actual SNR λ = 0 dB.

In Figure 4 we compare the receiver operating characteristic curves of the proposed method and traditional energy detection for actual SNR λ = 5 dB and number of signal samples $N_s$ = 5. Although the actual SNR λ increases to 5 dB, the number of signal samples $N_s$ = 5 is still small, which means the corresponding spectrum sensing problem based on energy detection is still linearly inseparable. Therefore, as shown in Figure 4, the performance of the proposed method is dramatically better than that of the traditional energy detection method.

Figure 4.

Receiver operating characteristic curve of the proposed method and traditional energy detection for number of signal samples N_s = 5 and actual SNR λ = 5 dB.

In Figure 5 we compare $P_d$ of the proposed method and traditional energy detection for fixed $P_f$ = 0.1 and number of signal samples $N_s$ = 1,000. The proposed method still performs better than energy detection: although the number of sensing samples is large (1,000), the actual SNR λ is low, so the corresponding spectrum sensing problem is linearly inseparable, and the proposed method can classify a linearly inseparable problem efficiently. As shown in Figure 5, at λ = −20 dB the proposed method is almost three times better than traditional energy detection. As λ increases, the spectrum sensing problem converges to a linearly separable one, and the performance difference between the proposed method and traditional energy detection decreases.

Figure 5.

P_d of the proposed method and traditional energy detection for number of signal samples N_s = 1,000, P_f = 0.1 and actual SNR λ from −20 dB to 0 dB.

Conclusions

In this paper, a novel modular spectrum sensing method for cognitive radio based on PSO-SVM is proposed. It comprises two distinct modules, Offline and Online. In the Offline module, the decision functions with their associated probabilities of false alarm are obtained. In the Online module, the primary user is detected by using the decision functions obtained in the Offline module. The proposed method is independent of the traditional detection method: a nonlinear decision function replaces the linear threshold, which drastically improves detection performance without increasing the computational complexity in the Online phase. The approach can be used for various signal detection applications without a priori knowledge of signals and channels. Simulations have been carried out to evaluate the performance of the proposed method, and they show that the proposed approach is more effective than the traditional energy detection approach in hostile environments. More specifically, when few received signal samples are available and the SNR is low, the proposed approach gives reliable performance, while the traditional energy detection approach performs poorly. In future work, we will try to apply the proposed method to enhance more sophisticated detection algorithms that use a predefined linear threshold (e.g., the method proposed in [14]). In addition, secondary users could be located with a detected radio map, thereby enabling space-time spectrum sensing, and the specific signal pattern of the primary user could be recognized by analyzing the detected signal.

2 in total

1. Support vector machines for histogram-based image classification.

Authors: O Chapelle; P Haffner; V N Vapnik
Journal: IEEE Trans Neural Netw Date: 1999

2. The non-central chi2- and F-distributions and their applications.

Authors: P B PATNAIK
Journal: Biometrika Date: 1949-06 Impact factor: 2.445
